Thread: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
[HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
From
Tom Lane
Date:
Now that we've got consistent failure reports about the 009_twophase.pl recovery test, I set out to find out why it's failing. It looks to me like the reason is that this (twophase.c:2145): SubTransSetParent(xid, subxid, overwriteOK); ought to be this: SubTransSetParent(subxid, xid, overwriteOK); because the definition of SubTransSetParent is void SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK) not the other way 'round. While "git blame" blames this line on the recent commit 728bd991c, that just moved the call from somewhere else. AFAICS this has actually been wrong since StandbyRecoverPreparedTransactions was written, in 361bd1662 of 2010-04-13. It's not clear to me how much potential this has to create user data corruption, but it doesn't look good at first glance. Discuss. Also, when I fix that, it gets further but still crashes at the same Assert in SubTransSetParent. The proximate cause this time seems to be that RecoverPreparedTransactions's calculation of overwriteOK is wrong: it's computing that as "false", but in reality the subtrans link in question has already been set. regards, tom lane
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Andres Freund
Date:
On 2017-04-22 19:55:18 -0400, Tom Lane wrote: > Now that we've got consistent failure reports about the 009_twophase.pl > recovery test, I set out to find out why it's failing. It looks to me > like the reason is that this (twophase.c:2145): > > SubTransSetParent(xid, subxid, overwriteOK); > > ought to be this: > > SubTransSetParent(subxid, xid, overwriteOK); > > because the definition of SubTransSetParent is > > void > SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK) > > not the other way 'round. > > While "git blame" blames this line on the recent commit 728bd991c, > that just moved the call from somewhere else. AFAICS this has actually > been wrong since StandbyRecoverPreparedTransactions was written, > in 361bd1662 of 2010-04-13. > Also, when I fix that, it gets further but still crashes at the same > Assert in SubTransSetParent. The proximate cause this time seems to be > that RecoverPreparedTransactions's calculation of overwriteOK is wrong: > it's computing that as "false", but in reality the subtrans link in > question has already been set. > Yikes. This is clearly way undertested. It's also pretty scary that the code has recently been whacked out quite heavily (both 9.6 and master), without hitting anything around this - doesn't seem to bode well for how in-depth the testing was. > It's not clear to me how much potential this has to create user data > corruption, but it doesn't look good at first glance. Discuss. Hm. I think it can cause wrong tqual.c results in some edge cases. During HS, lastOverflowedXid will be set in some cases, and then XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in turn cause lookups snapshot->subxip (where all HS xids reside) potentially return wrong results. I was about to say that I don't see how it could result in persistent corruption however - the subtrans lookups are only done for (snapshot->xmin, snapshot->xmax] and subtrans is truncated regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory anymore, so that might be delayed. Hm. - Andres
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Simon Riggs
Date:
On 23 April 2017 at 00:55, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Now that we've got consistent failure reports about the 009_twophase.pl > recovery test, I set out to find out why it's failing. It looks to me > like the reason is that this (twophase.c:2145): > > SubTransSetParent(xid, subxid, overwriteOK); > > ought to be this: > > SubTransSetParent(subxid, xid, overwriteOK); > > because the definition of SubTransSetParent is > > void > SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK) > > not the other way 'round. > > While "git blame" blames this line on the recent commit 728bd991c, > that just moved the call from somewhere else. AFAICS this has actually > been wrong since StandbyRecoverPreparedTransactions was written, > in 361bd1662 of 2010-04-13. Thanks for finding that. > It's not clear to me how much potential this has to create user data > corruption, but it doesn't look good at first glance. Discuss. Well, strangely for different reason I was looking at that particular call a couple of days back. I didn't spot that issue, but I was thinking why we even make that call at that time. My conclusion was that it could be optimized away mostly, since it isn't needed very often, but its not really worth optimizing. SubTransSetParent() only matters for the case where we are half-way through a commit with a large commit. Since we do atomic updates of commit and subcommmit when on same page, this problem would only matter when top level xid and other subxids were separated across multiple pages and the issue would only affect a read only query checking visibility during the commit for that prepared transaction. Furthermore, the nature of the corruption is that we set the xid to point to the subxid; since we never mark a top-level commit as subcommitted, subtrans would never be consulted and so the corruption would not lead to any incorrect answer even during the window of exposure. So it looks to me like this error shouldn't cause a problem. We can fix that, but... > Also, when I fix that, it gets further but still crashes at the same > Assert in SubTransSetParent. The proximate cause this time seems to be > that RecoverPreparedTransactions's calculation of overwriteOK is wrong: > it's computing that as "false", but in reality the subtrans link in > question has already been set. Not sure about that, investigating. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Simon Riggs
Date:
On 23 April 2017 at 01:19, Andres Freund <andres@anarazel.de> wrote: > On 2017-04-22 19:55:18 -0400, Tom Lane wrote: >> Now that we've got consistent failure reports about the 009_twophase.pl >> recovery test, I set out to find out why it's failing. It looks to me >> like the reason is that this (twophase.c:2145): >> >> SubTransSetParent(xid, subxid, overwriteOK); >> >> ought to be this: >> >> SubTransSetParent(subxid, xid, overwriteOK); >> >> because the definition of SubTransSetParent is >> >> void >> SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK) >> >> not the other way 'round. >> >> While "git blame" blames this line on the recent commit 728bd991c, >> that just moved the call from somewhere else. AFAICS this has actually >> been wrong since StandbyRecoverPreparedTransactions was written, >> in 361bd1662 of 2010-04-13. > >> Also, when I fix that, it gets further but still crashes at the same >> Assert in SubTransSetParent. The proximate cause this time seems to be >> that RecoverPreparedTransactions's calculation of overwriteOK is wrong: >> it's computing that as "false", but in reality the subtrans link in >> question has already been set. >> > > Yikes. This is clearly way undertested. It's also pretty scary that > the code has recently been whacked out quite heavily (both 9.6 and > master), without hitting anything around this - doesn't seem to bode > well for how in-depth the testing was. Obviously if there is a bug it's because tests didn't find it and therefore it is by definition undertested for that specific bug. I'm not sure what other conclusion you wish to draw, if any? >> It's not clear to me how much potential this has to create user data >> corruption, but it doesn't look good at first glance. Discuss. > > Hm. I think it can cause wrong tqual.c results in some edge cases. > During HS, lastOverflowedXid will be set in some cases, and then > XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in > turn cause lookups snapshot->subxip (where all HS xids reside) > potentially return wrong results. > > I was about to say that I don't see how it could result in persistent > corruption however - the subtrans lookups are only done for > (snapshot->xmin, snapshot->xmax] and subtrans is truncated > regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory > anymore, so that might be delayed. Hm. I've not found any reason, yet, to believe we return wrong answers in any case, even though the transient data structure pg_subtrans is corrupted by the switched call Tom discovers. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes: >> Also, when I fix that, it gets further but still crashes at the same >> Assert in SubTransSetParent. The proximate cause this time seems to be >> that RecoverPreparedTransactions's calculation of overwriteOK is wrong: >> it's computing that as "false", but in reality the subtrans link in >> question has already been set. > Not sure about that, investigating. As a quick hack, I just hotwired RecoverPreparedTransactions to set overwriteOK = true always, and with that and the SubTransSetParent argument-order fix, HEAD passes the recovery tests. Maybe we can be smarter than that, but this might be a good short-term fix to get the buildfarm green again. regards, tom lane
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes: > On 23 April 2017 at 00:55, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> It's not clear to me how much potential this has to create user data >> corruption, but it doesn't look good at first glance. Discuss. > SubTransSetParent() only matters for the case where we are half-way > through a commit with a large commit. Since we do atomic updates of > commit and subcommmit when on same page, this problem would only > matter when top level xid and other subxids were separated across > multiple pages and the issue would only affect a read only query > checking visibility during the commit for that prepared transaction. > Furthermore, the nature of the corruption is that we set the xid to > point to the subxid; since we never mark a top-level commit as > subcommitted, subtrans would never be consulted and so the corruption > would not lead to any incorrect answer even during the window of > exposure. So it looks to me like this error shouldn't cause a problem. Still trying to wrap my head around this argument ... I agree that incorrectly setting the parent's pg_subtrans entry can't cause a visible problem, because it will never be consulted. However, failing to set the child's entry seems like it could cause transient problems. There would only be a risk during the eventual commit of the prepared transaction, and only when the pg_xact entries span multiple pages, but then there would be a window where the child xact is marked subcommitted but it has a zero entry in pg_subtrans. That would result in a WARNING from TransactionIdDidCommit, and a false result, which depending on timing might mean that commit of the overall transaction appears non-atomic to onlookers. (That is, the parent xact might already appear committed to them, but the subtransaction not yet.) I can't find any indication in the archives that anyone's ever reported seeing that WARNING, though that might only mean that they'd not noticed it. But in any case, it seems like the worst consequence is a narrow window for seeing non-atomic commit of a prepared transaction on a standby server. That seems pretty unlikely to cause real-world problems. The other point about overwriteOK not being set when it needs to be also seems like it could not cause any problems in production builds, since that flag is only examined by an Assert. regards, tom lane
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Simon Riggs
Date:
On 23 April 2017 at 17:17, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndquadrant.com> writes: >>> Also, when I fix that, it gets further but still crashes at the same >>> Assert in SubTransSetParent. The proximate cause this time seems to be >>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong: >>> it's computing that as "false", but in reality the subtrans link in >>> question has already been set. > >> Not sure about that, investigating. > > As a quick hack, I just hotwired RecoverPreparedTransactions to set > overwriteOK = true always, and with that and the SubTransSetParent > argument-order fix, HEAD passes the recovery tests. Maybe we can > be smarter than that, but this might be a good short-term fix to get > the buildfarm green again. That would work. I've been looking into a fix I can explain, but "do it always" may actually be it. OK, I'll do that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Simon Riggs
Date:
On 23 April 2017 at 18:41, Simon Riggs <simon@2ndquadrant.com> wrote: > On 23 April 2017 at 17:17, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Simon Riggs <simon@2ndquadrant.com> writes: >>>> Also, when I fix that, it gets further but still crashes at the same >>>> Assert in SubTransSetParent. The proximate cause this time seems to be >>>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong: >>>> it's computing that as "false", but in reality the subtrans link in >>>> question has already been set. >> >>> Not sure about that, investigating. >> >> As a quick hack, I just hotwired RecoverPreparedTransactions to set >> overwriteOK = true always, and with that and the SubTransSetParent >> argument-order fix, HEAD passes the recovery tests. Maybe we can >> be smarter than that, but this might be a good short-term fix to get >> the buildfarm green again. > > That would work. I've been looking into a fix I can explain, but "do > it always" may actually be it. > > OK, I'll do that. Done, tests pass again for me now. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Andres Freund
Date:
Hi, On 2017-04-23 11:05:44 +0100, Simon Riggs wrote: > > Yikes. This is clearly way undertested. It's also pretty scary that > > the code has recently been whacked out quite heavily (both 9.6 and > > master), without hitting anything around this - doesn't seem to bode > > well for how in-depth the testing was. > > Obviously if there is a bug it's because tests didn't find it and > therefore it is by definition undertested for that specific bug. Sure. But there's bugs that are hard to catch, and there's bugs that are easily testable. This seems to largely fall into the "relatively easy to test" camp. > I'm not sure what other conclusion you wish to draw, if any? That the changes around whacking around twophase.c in several of the last releases, didn't yield enough verified testing infrastructure. > >> It's not clear to me how much potential this has to create user data > >> corruption, but it doesn't look good at first glance. Discuss. > > > > Hm. I think it can cause wrong tqual.c results in some edge cases. > > During HS, lastOverflowedXid will be set in some cases, and then > > XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in > > turn cause lookups snapshot->subxip (where all HS xids reside) > > potentially return wrong results. > > > > I was about to say that I don't see how it could result in persistent > > corruption however - the subtrans lookups are only done for > > (snapshot->xmin, snapshot->xmax] and subtrans is truncated > > regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory > > anymore, so that might be delayed. Hm. > > I've not found any reason, yet, to believe we return wrong answers in > any case, even though the transient data structure pg_subtrans is > corrupted by the switched call Tom discovers. I think I pointed out a danger above. Consider what happens if query on a standby has a suboverflowed snapshot: Snapshot GetSnapshotData(Snapshot snapshot) { ... if (TransactionIdPrecedesOrEquals(xmin, procArray->lastOverflowedXid)) suboverflowed = true;} ..snapshot->suboverflowed = suboverflowed; } In that case we rely on pg_subtrans for visibility determinations: bool HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot, Buffer buffer) { ...if (!HeapTupleHeaderXminCommitted(tuple)){ ... else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot)) return false; and static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot) { ...if (!snapshot->takenDuringRecovery){ ...else{ ... if (snapshot->suboverflowed) { /* * Snapshot overflowed, so convert xid to top-level. This is safe * because we eliminated too-old XIDs above. */ xid = SubTransGetTopmostTransaction(xid); if the subxid->xid mapping doesn't actually exist - as it's the case with this bug afaics - we'll not get the correct toplevel transaction. Which'll mean the following block: /* * We now have either a top-level xid higher than xmin or an * indeterminate xid. We don't know whether it's top level or subxact * but it doesn't matter. If it's present, the xidis visible. */ for (j = 0; j < snapshot->subxcnt; j++) { if (TransactionIdEquals(xid, snapshot->subxip[j])) return true; } won't work correctly if suboverflowed. I don't see anything prevent wrong results here? Andres Freund
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Simon Riggs
Date:
On 24 April 2017 at 00:25, Andres Freund <andres@anarazel.de> wrote: >> >> It's not clear to me how much potential this has to create user data >> >> corruption, but it doesn't look good at first glance. Discuss. >> > >> > Hm. I think it can cause wrong tqual.c results in some edge cases. >> > During HS, lastOverflowedXid will be set in some cases, and then >> > XidInMVCCSnapshot etc will not find the actual toplevel xid, which'll in >> > turn cause lookups snapshot->subxip (where all HS xids reside) >> > potentially return wrong results. >> > >> > I was about to say that I don't see how it could result in persistent >> > corruption however - the subtrans lookups are only done for >> > (snapshot->xmin, snapshot->xmax] and subtrans is truncated >> > regularly. But these days CHECKPOINT_END_OF_RECOVERY isn't obligatory >> > anymore, so that might be delayed. Hm. >> >> I've not found any reason, yet, to believe we return wrong answers in >> any case, even though the transient data structure pg_subtrans is >> corrupted by the switched call Tom discovers. > > I think I pointed out a danger above. Consider what happens if query on > a standby has a suboverflowed snapshot: > Snapshot > GetSnapshotData(Snapshot snapshot) > { > ... > if (TransactionIdPrecedesOrEquals(xmin, procArray->lastOverflowedXid)) > suboverflowed = true; > } > .. > snapshot->suboverflowed = suboverflowed; > } > > In that case we rely on pg_subtrans for visibility determinations: > bool > HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot, > Buffer buffer) > { > ... > if (!HeapTupleHeaderXminCommitted(tuple)) > { > ... > else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot)) > return false; > > and > static bool > XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot) > { > ... > if (!snapshot->takenDuringRecovery) > { > ... > else > { > ... > if (snapshot->suboverflowed) > { > /* > * Snapshot overflowed, so convert xid to top-level. This is safe > * because we eliminated too-old XIDs above. > */ > xid = SubTransGetTopmostTransaction(xid); > > if the subxid->xid mapping doesn't actually exist - as it's the case > with this bug afaics - we'll not get the correct toplevel > transaction. The nature of the corruption is that in some cases * a subxid will point to nothing (even though in most cases it was already set correctly) * the parent will point to a subxid So both wrong. > Which'll mean the following block: > /* > * We now have either a top-level xid higher than xmin or an > * indeterminate xid. We don't know whether it's top level or subxact > * but it doesn't matter. If it's present, the xid is visible. > */ > for (j = 0; j < snapshot->subxcnt; j++) > { > if (TransactionIdEquals(xid, snapshot->subxip[j])) > return true; > } > won't work correctly if suboverflowed. Your example of snapshots taken during recovery is not correct. Note that SubTransGetTopmostTransaction() returns a valid, running xid, even though it is the wrong one. Snapshots work differently on standbys - we store all known running xids, so the test still passes correctly, even when overflowed. I'd call that just plain luck. We behave correctly, but for the wrong reasons, at least in this case. > I don't see anything prevent wrong results here? I've had an even better look around now and I think I've found something but need to turn it into a repeatable test case so I can double-check this before reporting in full, so I don't cry wolf. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Nikhil Sontakke
Date:
>>>> Also, when I fix that, it gets further but still crashes at the same
>>>> Assert in SubTransSetParent. The proximate cause this time seems to be
>>>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
>>>> it's computing that as "false", but in reality the subtrans link in
>>>> question has already been set.
>>
>>> Not sure about that, investigating.
>>
>> As a quick hack, I just hotwired RecoverPreparedTransactions to set
>> overwriteOK = true always, and with that and the SubTransSetParent
>> argument-order fix, HEAD passes the recovery tests. Maybe we can
>> be smarter than that, but this might be a good short-term fix to get
>> the buildfarm green again.
>
> That would work. I've been looking into a fix I can explain, but "do
> it always" may actually be it.
On Master:
BEGIN;
INSERT INTO t_009_tbl VALUES (42);
SAVEPOINT s1;
INSERT INTO t_009_tbl VALUES (43);
PREPARE TRANSACTION 'xact_009_1';
On Master:
Do a fast shutdown
On Slave:
Restart the slave. This causes StandbyRecoverPreparedTransactions() to be invoked which causes ProcessTwoPhaseBuffer() to be invoked with setParent set to true. This sets up parent xid for the above subtransaction.
On Slave:
Promote the slave. This causes RecoverPreparedTransactions() to be invoked which ends up calling SubTransSetParent() for the above subxid. The overwriteOK bool remains false since we don't go over the PGPROC_MAX_CACHED_SUBXIDS limit. Since the parent xid was already set on restart above, we hit the assert.
-------------------
I am wondering if StandbyRecoverPreparedTransactions() needs the parent linkage at all? I don't see SubTransGetParent() and related functions being called on the standby but we need to be careful here. As a quick test, If I don't call SubTransSetParent() as part of StandbyRecoverPreparedTransactions() then all recovery tests pass ok. But the fact that StandbyRecoverPreparedTransactions() takes overwriteOK as a parameter means it might not be so simple..
Regards,
Nikhils
-- Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Andres Freund
Date:
On 2017-04-24 13:29:11 +0100, Simon Riggs wrote: > On 24 April 2017 at 00:25, Andres Freund <andres@anarazel.de> wrote: > > if the subxid->xid mapping doesn't actually exist - as it's the case > > with this bug afaics - we'll not get the correct toplevel > > transaction. > > The nature of the corruption is that in some cases > * a subxid will point to nothing (even though in most cases it was > already set correctly) > * the parent will point to a subxid Right. Those cases aren't that different from the point of trying to find the parent of an subxid. > > Which'll mean the following block: > > /* > > * We now have either a top-level xid higher than xmin or an > > * indeterminate xid. We don't know whether it's top level or subxact > > * but it doesn't matter. If it's present, the xid is visible. > > */ > > for (j = 0; j < snapshot->subxcnt; j++) > > { > > if (TransactionIdEquals(xid, snapshot->subxip[j])) > > return true; > > } > > won't work correctly if suboverflowed. > > Your example of snapshots taken during recovery is not correct. Oh? > Note that SubTransGetTopmostTransaction() returns a valid, running > xid, even though it is the wrong one. Sure. > Snapshots work differently on standbys - we store all known running > xids, so the test still passes correctly, even when overflowed. I don't think that's generally true. Isn't that precisely what ProcArrayStruct->lastOverflowedXid is about? If we have a snapshot that's suboverflowed due to the lastOverflowedXid cutoff, then we the subxip array does *not* contain all known running xids anymore, we rely on pg_subtrans to only guarantee that toplevel xids are stored in the KnownAssignedXids machinery. See:* When we throw away subXIDs from KnownAssignedXids, we need to keep track of* that, similarly to tracking overflow ofa PGPROC's subxids array. We do* that by remembering the lastOverflowedXID, ie the last thrown-away subXID.* As long asthat is within the range of interesting XIDs, we have to assume* that subXIDs are missing from snapshots. (Note that subXIDoverflow occurs* on primary when 65th subXID arrives, whereas on standby it occurs when 64th* subXID arrives - thatis not an error.) /* * Highest subxid that has been removed from KnownAssignedXids array to * prevent overflow; or InvalidTransactionId ifnone. We track this for * similar reasons to tracking overflowing cached subxids in PGXACT * entries. Must hold exclusiveProcArrayLock to change this, and shared * lock to read it. */TransactionId lastOverflowedXid; Greetings, Andres Freund
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Simon Riggs
Date:
On 23 April 2017 at 17:17, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndquadrant.com> writes: >>> Also, when I fix that, it gets further but still crashes at the same >>> Assert in SubTransSetParent. The proximate cause this time seems to be >>> that RecoverPreparedTransactions's calculation of overwriteOK is wrong: >>> it's computing that as "false", but in reality the subtrans link in >>> question has already been set. > >> Not sure about that, investigating. > > As a quick hack, I just hotwired RecoverPreparedTransactions to set > overwriteOK = true always, and with that and the SubTransSetParent > argument-order fix, HEAD passes the recovery tests. Maybe we can > be smarter than that, but this might be a good short-term fix to get > the buildfarm green again. So my thoughts are now that this is indeed the long term fix. I can't think of why it would be necessary to call SubTransSetParent() twice with two different values. A subtransaction's top level xid never changes, so pg_subtrans should be either zero or the xid of the parent and never anything else. I can't see any reason now why overwriteOK should exist at all. I'm guessing that the whole "overwriteOK" idea was an incorrect response to xids appearing where they shouldn't have done because of the mistake you just corrected. So I will now remove the parameter from the call. (I'll address the discussion of the impact of that bug in separate post). -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes: > I can't see any reason now why overwriteOK should exist at all. I'm > guessing that the whole "overwriteOK" idea was an incorrect response > to xids appearing where they shouldn't have done because of the > mistake you just corrected. So I will now remove the parameter from > the call. Seems reasonable, but I don't like the logic change you made in SubTransSetParent; you broke the former invariant, for non-Assert builds, that the target pg_subtrans entry is guaranteed to have the correct value on exit. I do like fixing it to not dirty the page unnecessarily, but I'd suggest that we write it like if (*ptr != parent){ Assert(*ptr == InvalidTransactionId); *ptr = parent; SubTransCtl->shared->page_dirty[slotno]= true;} regards, tom lane
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Simon Riggs
Date:
On 25 April 2017 at 16:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndquadrant.com> writes: >> I can't see any reason now why overwriteOK should exist at all. I'm >> guessing that the whole "overwriteOK" idea was an incorrect response >> to xids appearing where they shouldn't have done because of the >> mistake you just corrected. So I will now remove the parameter from >> the call. > > Seems reasonable, but I don't like the logic change you made in > SubTransSetParent; you broke the former invariant, for non-Assert > builds, that the target pg_subtrans entry is guaranteed to have > the correct value on exit. I do like fixing it to not dirty the > page unnecessarily, but I'd suggest that we write it like > > if (*ptr != parent) > { > Assert(*ptr == InvalidTransactionId); > *ptr = parent; > SubTransCtl->shared->page_dirty[slotno] = true; > } OK, thanks. I'll commit that tomorrow. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Simon Riggs
Date:
On 24 April 2017 at 19:59, Andres Freund <andres@anarazel.de> wrote: > I don't think that's generally true. If you think that, from a risk perspective it is enough for me to continue to investigate and I have been doing that. As I said before I thought I had found a problem. I'm suggesting we take the approach that if there is a problem we can recreate it as a way of exploring what conditions are required and therefore work out the impact. Nikhil Sontakke appears to have re-created something, but not quite what I had expected. I think he will post here tomorrow with an update for us to discuss. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Andres Freund
Date:
On 2017-04-25 21:22:44 +0100, Simon Riggs wrote: > On 24 April 2017 at 19:59, Andres Freund <andres@anarazel.de> wrote: > > > I don't think that's generally true. > > If you think that, from a risk perspective it is enough for me to > continue to investigate and I have been doing that. I'm not sure it's worth digging it all that much - I think we just shouldn't claim in the release notes that the bug doesn't have any consequences, but that it's though to only matter in unlikely corner cases. - Andres
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Michael Paquier
Date:
On Wed, Apr 26, 2017 at 4:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 25 April 2017 at 16:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Simon Riggs <simon@2ndquadrant.com> writes: >>> I can't see any reason now why overwriteOK should exist at all. I'm >>> guessing that the whole "overwriteOK" idea was an incorrect response >>> to xids appearing where they shouldn't have done because of the >>> mistake you just corrected. So I will now remove the parameter from >>> the call. >> >> Seems reasonable, but I don't like the logic change you made in >> SubTransSetParent; you broke the former invariant, for non-Assert >> builds, that the target pg_subtrans entry is guaranteed to have >> the correct value on exit. I do like fixing it to not dirty the >> page unnecessarily, but I'd suggest that we write it like >> >> if (*ptr != parent) >> { >> Assert(*ptr == InvalidTransactionId); >> *ptr = parent; >> SubTransCtl->shared->page_dirty[slotno] = true; >> } > > OK, thanks. I'll commit that tomorrow. A couple of comments about the last patch of this thread. Nit: there is a complain about a trailing whitespace: $ git diff --check src/backend/access/transam/twophase.c:2011: trailing whitespace. + bool fromdisk, + * It's possible that SubTransSetParent has been set before, if + * the prepared transaction generated xid assignment records. */ for (i = 0; i < hdr->nsubxacts; i++) - SubTransSetParent(subxids[i], xid, true); + SubTransSetParent(subxids[i], xid); You can remove this part in RecoverPreparedTransactions() and just switch ProcessTwoPhaseBuffer()'s setParent to true to do the same thing. The existing comment block should be moved before ProcessTwoPhaseBuffer() call. It is critical to mention that pg_subtrans is not kept after a restart. -- Michael
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Nikhil Sontakke
Date:
I'm suggesting we take the approach that if there is a problem we can
recreate it as a way of exploring what conditions are required and
therefore work out the impact. Nikhil Sontakke appears to have
re-created something, but not quite what I had expected. I think he
will post here tomorrow with an update for us to discuss.
So, I reverted commit 0874d4f3e183757ba15a4b3f3bf563e0393dd9c2 to go back to the earlier bad swapped arguments to SubTransSetParent resulting in incorrect parent linkages and used the attached TAP test patch.
The test prepares a 2PC with more than 64 subtransactions. It then stops the master and promotes the standby.
A SELECT query on the newly promoted master on any of the tables involved in the 2PC hangs. The hang is due to a loop in SubTransGetTopmostTransaction(). Due to incorrect linkages, we get a circular reference in parentxid <-> subxid inducing the infinite loop.
Any further DML on these objects which will need to check visibility of these tuples hangs as well. All unrelated objects and new transactions are ok AFAICS.
I do not see any data loss, which is good. However tables involved in the 2PC are inaccessible till after a hard restart.
The attached TAP test patch can be considered for commit to test handling 2PC with large subtransactions on promoted standby instances.
Regards,
Nikhils
--
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
Attachment
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
From
Tom Lane
Date:
Nikhil Sontakke <nikhils@2ndquadrant.com> writes: > A SELECT query on the newly promoted master on any of the tables involved > in the 2PC hangs. The hang is due to a loop in > SubTransGetTopmostTransaction(). Due to incorrect linkages, we get a > circular reference in parentxid <-> subxid inducing the infinite loop. I'm inclined to propose that (1) SubTransSetParent should contain an assert that Assert(TransactionIdFollows(xid, parent)); There is a similar assertion near one of the call sites, but that's obviously too far away from the action. (2) SubTransGetTopmostTransaction ought to contain an actual runtime test and ereport (not just Assert) that the parent XID it got from pg_subtrans precedes the child XID that was looked up. This would protect us against similar infinite loops in the future. Even without bugs, such a loop could arise due to corrupted data. regards, tom lane
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Simon Riggs
Date:
On 26 April 2017 at 15:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Nikhil Sontakke <nikhils@2ndquadrant.com> writes: >> A SELECT query on the newly promoted master on any of the tables involved >> in the 2PC hangs. The hang is due to a loop in >> SubTransGetTopmostTransaction(). Due to incorrect linkages, we get a >> circular reference in parentxid <-> subxid inducing the infinite loop. > > I'm inclined to propose that > > (1) SubTransSetParent should contain an assert that > Assert(TransactionIdFollows(xid, parent)); > There is a similar assertion near one of the call sites, but that's > obviously too far away from the action. > > (2) SubTransGetTopmostTransaction ought to contain an actual runtime > test and ereport (not just Assert) that the parent XID it got from > pg_subtrans precedes the child XID that was looked up. This would > protect us against similar infinite loops in the future. Even without > bugs, such a loop could arise due to corrupted data. Pretty much what I was thinking. I've added code following Michael and Tom's comments to the previous patch. New patch attached. Passes with and without Nikhil's new test. I plan to apply both patches tomorrow morning my time to give time for further comments. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes: > I've added code following Michael and Tom's comments to the previous > patch. New patch attached. Couple of minor suggestions: * Rather than deleting the comment for SubTransSetParent entirely, maybe make it say "It's possible that the parent was already recorded. However, we should never be asked to change an already-set entry to something else." * In SubTransGetTopmostTransaction, maybe it'd be better to spell "TransactionIdFollowsOrEquals" as "!TransactionIdPrecedes" to make it look more like the test just above. Matter of taste though. * Less a matter of taste is that I think that should be just elog(ERROR); there's no good reason to make it FATAL. * Also, I think there should be a comment there, along the lines of "check for reversed linkage to ensure this isn't an infinite loop". regards, tom lane
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly
From
Michael Paquier
Date:
On Thu, Apr 27, 2017 at 2:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndquadrant.com> writes: >> I've added code following Michael and Tom's comments to the previous >> patch. New patch attached. > > Couple of minor suggestions: > > * Rather than deleting the comment for SubTransSetParent entirely, > maybe make it say "It's possible that the parent was already recorded. > However, we should never be asked to change an already-set entry to > something else." > > * In SubTransGetTopmostTransaction, maybe it'd be better to spell > "TransactionIdFollowsOrEquals" as "!TransactionIdPrecedes" to make > it look more like the test just above. Matter of taste though. > > * Less a matter of taste is that I think that should be just elog(ERROR); > there's no good reason to make it FATAL. > > * Also, I think there should be a comment there, along the lines of > "check for reversed linkage to ensure this isn't an infinite loop". No more comments from here, thanks for working on the patch. -- Michael