Thread: MultiXact bugs
Hi, The attached pgbench testcase can reproduce two issues: 1) (takes a bit) TRAP: FailedAssertion("!(((xid) >= ((TransactionId) 3)))", File:/pruneheap.c", Line: 601) That's because HeapTupleHeaderGetUpdateXid() ignores aborted updaters and returns InvalidTransactionId in that case, but HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS... Looks like a 9.3+ issue, and shouldn't have a ll that high impact in non-assert builds, page pruning will be delayed a bit. 2) we frequently error out in heap_lock_updated_tuple_rec() with ERROR: unable to fetch updated version of tuple That's because when we're following a ctid chain, it's perfectly possible for the updated version of a tuple to already have been vacuumed/pruned away if the the updating transaction has aborted. Also looks like a 9.3+ issues to me, slightly higher impact, but in the end you're just getting some errors under concurrency. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Andres Freund wrote: Hi, > The attached pgbench testcase can reproduce two issues: Thanks. > 2) we frequently error out in heap_lock_updated_tuple_rec() with > ERROR: unable to fetch updated version of tuple > > That's because when we're following a ctid chain, it's perfectly > possible for the updated version of a tuple to already have been > vacuumed/pruned away if the the updating transaction has aborted. > > Also looks like a 9.3+ issues to me, slightly higher impact, but in the > end you're just getting some errors under concurrency. Yes, I think this is 9.3 only. First attachment shows my proposed patch, which is just to report success to caller in case the tuple cannot be acquired by heap_fetch. This is OK because if by the time we check the updated version of a tuple it is gone, it means there is no further update chain to follow and lock. > 1) (takes a bit) > TRAP: FailedAssertion("!(((xid) >= ((TransactionId) 3)))", File:/pruneheap.c", Line: 601) > > That's because HeapTupleHeaderGetUpdateXid() ignores aborted updaters > and returns InvalidTransactionId in that case, but > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS... Interesting. This is a very narrow race condition: when we call HeapTupleSafisfiesVacuum the updater is still running, so it returns HEAPTUPLE_DELETE_IN_PROGRESS; but it aborts just before we read the tuple's update Xid. So that one returns InvalidXid and that causes the failure. I was able to hit this race condition very quickly by adding a pg_usleep(1000) in heap_prune_chain(), in the HEAPTUPLE_DELETE_IN_PROGRESS case, before trying to acquire the update Xid. There is no way to close the window, but there is no need; if the updater aborted, we don't need to mark the page prunable in the first place. So we can just check the return value of HeapTupleHeaderGetUpdateXid and if it's InvalidXid, don't set the prunable bit. The second attachment below fixes the bug that way. I checked for other cases where the update Xid is checked after HeapTupleSatisfiesVacuum returns HEAPTUPLE_DELETE_IN_PROGRESS. As far as I can tell, the only one that would be affected is the one in predicate.c. It is far from clear to me what is the right thing to do in these cases; the simplest idea is to return without reporting a failure if the updater aborted, just as above; but I wonder if this needs to be conditional on "visible". I added a pg_usleep() before acquiring the update Xid in the relevant case, but the isolation test cases didn't hit the problem (I presume there is no update/delete in these test cases, but I didn't check). I defer to Kevin on this issue. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 2013-11-25 12:36:19 -0300, Alvaro Herrera wrote: > > 2) we frequently error out in heap_lock_updated_tuple_rec() with > > ERROR: unable to fetch updated version of tuple > > > > That's because when we're following a ctid chain, it's perfectly > > possible for the updated version of a tuple to already have been > > vacuumed/pruned away if the the updating transaction has aborted. > > > > Also looks like a 9.3+ issues to me, slightly higher impact, but in the > > end you're just getting some errors under concurrency. > > Yes, I think this is 9.3 only. First attachment shows my proposed > patch, which is just to report success to caller in case the tuple > cannot be acquired by heap_fetch. This is OK because if by the time we > check the updated version of a tuple it is gone, it means there is no > further update chain to follow and lock. Looks good. > > 1) (takes a bit) > > TRAP: FailedAssertion("!(((xid) >= ((TransactionId) 3)))", File:/pruneheap.c", Line: 601) > > > > That's because HeapTupleHeaderGetUpdateXid() ignores aborted updaters > > and returns InvalidTransactionId in that case, but > > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS... > > Interesting. This is a very narrow race condition: when we call > HeapTupleSafisfiesVacuum the updater is still running, so it returns > HEAPTUPLE_DELETE_IN_PROGRESS; but it aborts just before we read the > tuple's update Xid. Well, it's not *that* narrow - remember that a transaction is marked as aborted in the clog *before* it is removed from the proc array. > There is no way to close the window, but there is no need; if the > updater aborted, we don't need to mark the page prunable in the first > place. So we can just check the return value of > HeapTupleHeaderGetUpdateXid and if it's InvalidXid, don't set the > prunable bit. The second attachment below fixes the bug that way. I am not sure I like the fact that HeapTupleHeaderGetUpdateXid() checks for aborted transactions in the first place. Why is that a good idea? ISTM that wanders off a fair bit from the other HeapTupleHeaderGet* macros. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote: > On 2013-11-25 12:36:19 -0300, Alvaro Herrera wrote: > > There is no way to close the window, but there is no need; if the > > updater aborted, we don't need to mark the page prunable in the first > > place. So we can just check the return value of > > HeapTupleHeaderGetUpdateXid and if it's InvalidXid, don't set the > > prunable bit. The second attachment below fixes the bug that way. > > I am not sure I like the fact that HeapTupleHeaderGetUpdateXid() checks > for aborted transactions in the first place. Why is that a good idea? > ISTM that wanders off a fair bit from the other HeapTupleHeaderGet* > macros. Originally it didn't, which caused various bugs. I recall it turned out to be cleaner to do the check inside it than putting it out to its callers. I have thoughts that this design might break other things such as the priorXmax checking while traversing HOT chains. Not seeing how: surely if there's an aborted updater in a tuple, there can't be a followup HOT chain elsewhere involving the same tuple. A HOT chain would require another updater Xid in the MultiXact (and we ensure there can only be one updater in a multi). I might be missing something. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2013-11-25 13:45:53 -0300, Alvaro Herrera wrote: > Andres Freund wrote: > > On 2013-11-25 12:36:19 -0300, Alvaro Herrera wrote: > > > > There is no way to close the window, but there is no need; if the > > > updater aborted, we don't need to mark the page prunable in the first > > > place. So we can just check the return value of > > > HeapTupleHeaderGetUpdateXid and if it's InvalidXid, don't set the > > > prunable bit. The second attachment below fixes the bug that way. > > > > I am not sure I like the fact that HeapTupleHeaderGetUpdateXid() checks > > for aborted transactions in the first place. Why is that a good idea? > > ISTM that wanders off a fair bit from the other HeapTupleHeaderGet* > > macros. > > Originally it didn't, which caused various bugs. I recall it turned out > to be cleaner to do the check inside it than putting it out to its > callers. This seems strange to me because we do *not* do those checks for !IS_MULTI xmax's. So there surely shouldn't be too many callers caring about that? > I have thoughts that this design might break other things such as the > priorXmax checking while traversing HOT chains. Not seeing how: Hm. Are you arguing with yourself about this? > surely > if there's an aborted updater in a tuple, there can't be a followup HOT > chain elsewhere involving the same tuple. A HOT chain would require > another updater Xid in the MultiXact (and we ensure there can only be > one updater in a multi). I might be missing something. I don't see dangers that way either. I think there might be some strange behaviour because callers need to check IsRunning() first though, which MultiXactIdGetUpdateXid() doesn't. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Andres Freund wrote: >> That's because HeapTupleHeaderGetUpdateXid() ignores aborted >> updaters and returns InvalidTransactionId in that case, but >> HeapTupleSatisfiesVacuum() returns >> HEAPTUPLE_DELETE_IN_PROGRESS... > I checked for other cases where the update Xid is checked after > HeapTupleSatisfiesVacuum returns HEAPTUPLE_DELETE_IN_PROGRESS. > As far as I can tell, the only one that would be affected is the > one in predicate.c. It is far from clear to me what is the right > thing to do in these cases; the simplest idea is to return > without reporting a failure if the updater aborted, just as > above; but I wonder if this needs to be conditional on "visible". > I added a pg_usleep() before acquiring the update Xid in the > relevant case, but the isolation test cases didn't hit the > problem (I presume there is no update/delete in these test cases, > but I didn't check). I defer to Kevin on this issue. Right now if HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS we call HeapTupleHeaderGetUpdateXid(tuple->t_data) and Assert() that the result is valid. It sounds like we should do something like the attached, maybe? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Andres Freund <andres@2ndquadrant.com> wrote: > HeapTupleHeaderGetUpdateXid() ignores aborted updaters > and returns InvalidTransactionId in that case, but > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS... That sure *sounds* like it should cause a problem for this code in CheckForSerializableConflictOut(): htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer); switch (htsvResult) { [ ... ] case HEAPTUPLE_DELETE_IN_PROGRESS: xid = HeapTupleHeaderGetUpdateXid(tuple->t_data); break; [ ... ] } Assert(TransactionIdIsValid(xid)); ... however, I have not been able to trigger that Assert even with gdb breakpoints at what I think are the right spots. Any suggestions? How far back is it true that the above HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same tuple structure can return InvalidTransactionId? Is there some other condition (besides a ROLLBACK of an UPDATE on the tuple being read) which needs to be met? Is any particular timing necessary? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2013-11-27 15:14:11 -0800, Kevin Grittner wrote: > Andres Freund <andres@2ndquadrant.com> wrote: > > > HeapTupleHeaderGetUpdateXid() ignores aborted updaters > > and returns InvalidTransactionId in that case, but > > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DELETE_IN_PROGRESS... > > That sure *sounds* like it should cause a problem for this code in > CheckForSerializableConflictOut(): Yea. IMNSHO the current state is a API design flaw. We really should be returning the aborted xid since that's what happens for non-multixact ids. > htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer); > switch (htsvResult) > { > [ ... ] > case HEAPTUPLE_DELETE_IN_PROGRESS: > xid = HeapTupleHeaderGetUpdateXid(tuple->t_data); > break; > [ ... ] > } > Assert(TransactionIdIsValid(xid)); > > ... however, I have not been able to trigger that Assert even with > gdb breakpoints at what I think are the right spots. Any > suggestions? How far back is it true that the above > HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS > but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same > tuple structure can return InvalidTransactionId? Is ther What do you mean with "how far back"? > e some > other condition (besides a ROLLBACK of an UPDATE on the tuple being > read) which needs to be met? Is any particular timing necessary? Afaics you need a multixact consisting out of a) the updater and b) a lock. That's probably easiest to get if you update a row in one session without changing the primary key, and then key-share lock it in another. Or the other way round. Then abort the updater. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote: > On 2013-11-27 15:14:11 -0800, Kevin Grittner wrote: >> ... however, I have not been able to trigger that Assert even with >> gdb breakpoints at what I think are the right spots. Any >> suggestions? How far back is it true that the above >> HeapTupleSatisfiesVacuum() can return HEAPTUPLE_DELETE_IN_PROGRESS >> but HeapTupleHeaderGetUpdateXid(tuple->t_data) on the exact same >> tuple structure can return InvalidTransactionId? > > What do you mean with "how far back"? What back-patching will be needed for a fix? It sounds like 9.3? > Afaics you need a multixact consisting out of a) the updater and b) a > lock. That's probably easiest to get if you update a row in one session > without changing the primary key, and then key-share lock it in > another. Or the other way round. > Then abort the updater. Thanks! I'll keep trying to generate a failure at that point. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2013-11-27 15:42:11 -0800, Kevin Grittner wrote: > Andres Freund <andres@2ndquadrant.com> wrote: > > What do you mean with "how far back"? > > What back-patching will be needed for a fix? It sounds like 9.3? Yep. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote: > On 2013-11-27 15:42:11 -0800, Kevin Grittner wrote: >> What back-patching will be needed for a fix? It sounds like >> 9.3? > > Yep. In going over this, I found pre-existing bugs when a tuple was both inserted and deleted by concurrent transactions, but fixing that is too invasive to consider for Monday's minor release lockdown. The attached seems very safe to me, and protects against some new hazards related to the subtransaction changes (mostly just for an assert-enabled build, but still worth fixing). It includes a lot of work on the comments, to guide the subsequent fixes or other work in that area. If nobody objects, I will push it to master and 9.3 tomorrow. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
On 2013-11-29 13:14:06 -0800, Kevin Grittner wrote: > Andres Freund <andres@2ndquadrant.com> wrote: > > On 2013-11-27 15:42:11 -0800, Kevin Grittner wrote: > > >> What back-patching will be needed for a fix? It sounds like > >> 9.3? > > > > Yep. > > In going over this, I found pre-existing bugs when a tuple was both > inserted and deleted by concurrent transactions, but fixing that is > too invasive to consider for Monday's minor release lockdown. The > attached seems very safe to me, and protects against some new > hazards related to the subtransaction changes (mostly just for an > assert-enabled build, but still worth fixing). It includes a lot > of work on the comments, to guide the subsequent fixes or other > work in that area. > If nobody objects, I will push it to master and 9.3 tomorrow. Alvaro is about to commit a patch that will remove the behaviour of GetUpdateXid() to check for IdDidAbort() because it makes other fixes really complicated and it's a really suprising behaviour. But most of your patch shouldn't be affected by that. Check http://archives.postgresql.org/message-id/20131129193008.GP5513%40eldon.alvh.no-ip.org for the current state of the series. Looking at predicate.c I think I see a bigger problem though: Isn't its usage of HeapTupleSatisfiesVacuum() quite dangerous? It passes TransactionXmin to HeapTupleSatisfiesVacuum(). But since that's just the transaction's own cutoff, not the global cutoff that will cause wrong hint bits to be set. Or am I missing something? HTSV's comment says:** OldestXmin is a cutoff XID (obtained from GetOldestXmin()). Tuples* deleted by XIDs >= OldestXminare deemed "recently dead"; they might* still be visible to some open transaction, so we can't remove them,* evenif we see that the deleting transaction has committed.*/ Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2013-11-29 22:27:16 +0100, Andres Freund wrote: > Looking at predicate.c I think I see a bigger problem though: Isn't its > usage of HeapTupleSatisfiesVacuum() quite dangerous? It passes > TransactionXmin to HeapTupleSatisfiesVacuum(). But since that's just the > transaction's own cutoff, not the global cutoff that will cause wrong > hint bits to be set. Or am I missing something? > HTSV's comment says: > * > * OldestXmin is a cutoff XID (obtained from GetOldestXmin()). Tuples > * deleted by XIDs >= OldestXmin are deemed "recently dead"; they might > * still be visible to some open transaction, so we can't remove them, > * even if we see that the deleting transaction has committed. > */ Strike that, sorry for the noise. I was thinking there was some conditional hint bit setting based on OldestXmin in there, but I was misremembering. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote: > Looking at predicate.c I think I see a bigger problem though: Isn't its > usage of HeapTupleSatisfiesVacuum() quite dangerous? It passes > TransactionXmin to HeapTupleSatisfiesVacuum(). But since that's just the > transaction's own cutoff, not the global cutoff that will cause wrong > hint bits to be set. Or am I missing something? I don't see where that parameter has anything to do with setting hint bits; it only seems to affect the return code for the caller. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Kevin Grittner <kgrittn@ymail.com> wrote: > In going over this, I found pre-existing bugs when a tuple was both > inserted and deleted by concurrent transactions, but fixing that is > too invasive to consider for Monday's minor release lockdown. The > attached seems very safe to me, and protects against some new > hazards related to the subtransaction changes (mostly just for an > assert-enabled build, but still worth fixing). It includes a lot > of work on the comments, to guide the subsequent fixes or other > work in that area. > > If nobody objects, I will push it to master and 9.3 tomorrow. Given the recent MultiXact patches, I no longer see any bugs in SSI which have not been there since the start, and these are in such remote corner cases that there have so far been no reports of anyone hitting them. I will wait until after this emergency release to patch those; the risk/benefit ratio of doing something so quickly just doesn't seem good. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company