Thread: uninterruptable loop: concurrent delete in progress within table
The attached script shows a plpgsql function that enters an infinite loop
which is not interrupted by statement_timeout.

Looping message is:

WARNING: concurrent delete in progress within table "crash9", xid is 4458893, self is 4458894/4458889
CONTEXT: SQL statement "CREATE INDEX ON crash9 USING GIST ( the_geom)"
        PL/pgSQL function crash(regclass) line 148 at EXECUTE statement

Note that the xid and self parts of the WARNING message were added by myself
on request by "andres" (on the freenode IRC channel).

I could reproduce with both 9.3.0 and 9.3.4 (current 9.3 stable branch).

Sounds like a PostgreSQL bug to me, what do you think?
Is there anything I can do from the plpgsql function to work around it?

--strk;

  ()   ASCII ribbon campaign  --  Keep it simple !
  /\   http://strk.keybit.net/rants/ascii_mails.txt
Sandro Santilli wrote: > The attached script shows a plpgsql function that enters > an infinite loop which is not interrupted by statement_timeout. > > Looping message is: > > WARNING: concurrent delete in progress within table "crash9", xid is 4458893, self is 4458894/4458889 > CONTEXT: SQL statement "CREATE INDEX ON crash9 USING GIST ( the_geom)" > PL/pgSQL function crash(regclass) line 148 at EXECUTE statement > > Note that the xid and self parts of the WARNING message were added by myself > on request by "andres" (on the freenode IRC channel). > > I could reproduce with both 9.3.0 and 9.3.4 (current 9.3 stable branch). > > Sounds like a PostgreSQL bug to me, what do you think ? Yes, probably, can you share the function and the WARNING patch you mention above? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi, On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote: > The attached script shows a plpgsql function that enters > an infinite loop which is not interrupted by statement_timeout. There's nothing attached... :) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Fri, May 30, 2014 at 04:45:35PM +0200, Andres Freund wrote: > Hi, > > On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote: > > The attached script shows a plpgsql function that enters > > an infinite loop which is not interrupted by statement_timeout. > > There's nothing attached... :) Oops, sorry. I've sent (and will send) on request to interested parties. Besides, I just tried against 9.1.13 and it fails there too (this time officially packaged by ubuntu) --strk;
On Fri, May 30, 2014 at 04:59:10PM +0200, Sandro Santilli wrote: > On Fri, May 30, 2014 at 04:45:35PM +0200, Andres Freund wrote: > > Hi, > > > > On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote: > > > The attached script shows a plpgsql function that enters > > > an infinite loop which is not interrupted by statement_timeout. > > > > There's nothing attached... :) > > Oops, sorry. > I've sent (and will send) on request to interested parties. Actually, I further reduced it to not rely on PostGIS and it is attached here. --strk; () ASCII ribbon campaign -- Keep it simple ! /\ http://strk.keybit.net/rants/ascii_mails.txt
Attachment
On 2014-05-30 17:28:03 +0200, Sandro Santilli wrote: > On Fri, May 30, 2014 at 04:59:10PM +0200, Sandro Santilli wrote: > > On Fri, May 30, 2014 at 04:45:35PM +0200, Andres Freund wrote: > > > Hi, > > > > > > On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote: > > > > The attached script shows a plpgsql function that enters > > > > an infinite loop which is not interrupted by statement_timeout. > > > > > > There's nothing attached... :) > > > > Oops, sorry. > > I've sent (and will send) on request to interested parties. > > Actually, I further reduced it to not rely on PostGIS > and it is attached here. Thanks. I've reproduced it and I am looking at it right now. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2014-05-30 17:33:22 +0200, Andres Freund wrote:
> > Actually, I further reduced it to not rely on PostGIS
> > and it is attached here.
>
> Thanks. I've reproduced it and I am looking at it right now.

The problem is that HeapTupleSatisfiesVacuum() doesn't really check xmax in
the TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)) case. That
seems to be a longstanding omission.

Thus the case where xmin = current xid, xmax = aborted subtransaction of the
current backend isn't handled correctly, and HEAPTUPLE_DELETE_IN_PROGRESS is
returned. That in turn leads to an endless goto loop in IndexBuildHeapScan(),
because waiting for an aborted xact doesn't do much good.

That itself is obviously easy to fix, but I wonder whether there aren't other
such omissions in HeapTupleSatisfiesVacuum() - from the looks of it, it wasn't
originally written to support the case where the current xact performed
changes itself. Not generally unreasonable, but long since obsolete.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
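[Editor's note: since the attachment is not reproduced here, the following is a
minimal hypothetical sketch of the sequence Andres describes (insert in the
current transaction, delete in a subtransaction that rolls back, then CREATE
INDEX on the same table). The function name, table, column and index type are
illustrative only and this is not the original script; whether this exact
script triggers the loop depends on server version.]

    -- Hypothetical sketch of the reported pattern (not the original attachment):
    -- xmin is the current transaction, xmax is an already-aborted
    -- subtransaction, then an index build scans the tuple.
    CREATE OR REPLACE FUNCTION crash_sketch() RETURNS void AS $$
    BEGIN
      CREATE TABLE crash9 (i int);
      INSERT INTO crash9 VALUES (1);          -- xmin = current transaction
      BEGIN
        DELETE FROM crash9;                   -- xmax = this subtransaction ...
        RAISE EXCEPTION 'force subxact abort';
      EXCEPTION WHEN OTHERS THEN
        NULL;                                 -- ... which is rolled back here
      END;
      EXECUTE 'CREATE INDEX ON crash9 (i)';   -- pre-fix: loops in IndexBuildHeapScan()
    END;
    $$ LANGUAGE plpgsql;

    SELECT crash_sketch();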
Hi,

On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote:
> The attached script shows a plpgsql function that enters
> an infinite loop which is not interrupted by statement_timeout.

> WARNING: concurrent delete in progress within table "crash9", xid is 4458893, self is 4458894/4458889
> CONTEXT: SQL statement "CREATE INDEX ON crash9 USING GIST ( the_geom)"
> PL/pgSQL function crash(regclass) line 148 at EXECUTE statement

I've attached a patch including an explanatory commit message. It includes a
regression test that fails with an endless loop on all supported releases
before the fix is applied.

Tom, Alvaro, All: Since it needs to fix HeapTupleSatisfiesVacuum(), and those
routines are tricky, I'd very much welcome a look over the changes - as
evidenced e.g. by the fact that exactly the buggy lines have been whacked
around repeatedly a long time ago. Especially the slight behavioural change of
HTSV deserves some review.

I do wonder if any of the other existing callers of HTSV are affected. I
don't understand predicate.c well enough to be sure, but it looks to me like
it could in theory lead to missed conflicts. Seems fairly unlikely to matter
in practice though.

I have to say I really hate the amount of repetitive code inside the
individual visibility routines. Obviously it's nothing backpatchable and may
become moot to a certain degree with the CSN work, but those are pretty close
to being unmaintainable.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Andres Freund wrote: > I do wonder if any of the other existing callers of HTSV are affected. I > don't understand predicate.c well enough to be sure, but it looks to me > like it'd could in theory lead to missed conflicts. Seems fairly > unlikely to matter in practice though. One place where the difference in the new logic would cause a change is the tupleIsAlive bit in IndexBuildHeapScan(), when building unique indexes. Right now if the tuple is inserted by a remote running transaction, and updated by it, we return DELETE_IN_PROGRESS, so we set tupleIsAlive = false; so we wouldn't block if we see a tuple with that value elsewhere. If we instead return INSERT_IN_PROGRESS we set tupleIsAlive = true, and we would block until the inserter is done. I think it'd be better to ensure that DELETE_IN_PROGRESS is returned as appropriate in the new XidIsInProgress(xmin) block. > I have to say I really hate the amount of repetitive code inside the > individual visibility routines. Obviously it's nothing backpatchable and > may become moot to a certain degree with the CSN work, but those are > pretty close to being unmaintainable. One thing I'm now wondering while reading this code is those cases where XMAX_INVALID is not set, but the value in Xmax is InvalidXid. We have discussed these scenarios elsewhere and the conclusion seems to be that they are rare but possible and therefore we should cater for them. (IIRC the scenario is a tuple that is frozen without a full-page write and no checksums; if we crash before writing the buffer, we would recover the invalid value in Xmax but the hint bit would not become set. I think this is not possible anymore after we tweaked the freezing protocol.) So instead of a check like if (tuple->t_infomask & HEAP_XMAX_INVALID) we would have something like if (HeapTupleHeaderXmaxIsInvalid(tuple)) which checks both things -- and is easier to read, IMHO. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
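[Editor's note: for illustration, a macro along the lines Alvaro sketches
above might look roughly like the following. HeapTupleHeaderXmaxIsInvalid is
hypothetical and not an existing PostgreSQL macro; the names it builds on
(HEAP_XMAX_INVALID, HeapTupleHeaderGetRawXmax, TransactionIdIsValid) are real.]

    /*
     * Hypothetical helper in the spirit of Alvaro's suggestion: treat xmax as
     * invalid if either the hint bit says so or the stored xmax is
     * InvalidTransactionId, instead of testing only the infomask bit.
     */
    #define HeapTupleHeaderXmaxIsInvalid(tup) \
        (((tup)->t_infomask & HEAP_XMAX_INVALID) != 0 || \
         !TransactionIdIsValid(HeapTupleHeaderGetRawXmax(tup)))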
On 2014-06-02 12:48:01 -0400, Alvaro Herrera wrote: > Andres Freund wrote: > > > I do wonder if any of the other existing callers of HTSV are affected. I > > don't understand predicate.c well enough to be sure, but it looks to me > > like it'd could in theory lead to missed conflicts. Seems fairly > > unlikely to matter in practice though. > > One place where the difference in the new logic would cause a change is > the tupleIsAlive bit in IndexBuildHeapScan(), when building unique > indexes. Right now if the tuple is inserted by a remote running > transaction, and updated by it, we return DELETE_IN_PROGRESS, so we set > tupleIsAlive = false; so we wouldn't block if we see a tuple with that > value elsewhere. If we instead return INSERT_IN_PROGRESS we set > tupleIsAlive = true, and we would block until the inserter is done. Hm. Maybe I am missing something, but if either INSERT or DELETE_IN_PROGRESS is returned for a unique index in IndexBuildHeapScan() we'll wait for the xmin/xmax respectively and recheck the tuple, right? That recheck will then return a non-ephemeral status. The only change in the case you describe that I can see is whether we loop once or twice? > I think it'd be better to ensure that DELETE_IN_PROGRESS is returned as > appropriate in the new XidIsInProgress(xmin) block. That's still a valid position. > > I have to say I really hate the amount of repetitive code inside the > > individual visibility routines. Obviously it's nothing backpatchable and > > may become moot to a certain degree with the CSN work, but those are > > pretty close to being unmaintainable. > > One thing I'm now wondering while reading this code is those cases where > XMAX_INVALID is not set, but the value in Xmax is InvalidXid. We have > discussed these scenarios elsewhere and the conclusion seems to be that > they are rare but possible and therefore we should cater for them. Is there really a situation where that combination is possible for XMAX_INVALID? I don't see how. Are you possibly thinking of the case where 9.3 assumed wrongly for a while that a set XMAX_INVALID would be sufficient for freezing? > (IIRC the scenario is a tuple that is frozen without a full-page write > and no checksums; if we crash before writing the buffer, we would > recover the invalid value in Xmax but the hint bit would not become set. > I think this is not possible anymore after we tweaked the freezing > protocol.) I don't think that could have happened before either. After the crash historically we'd have gone through the same freezing routine. That'd have either left the whole thing intact (if xmin went backwards) or it'd have set HEAP_XMAX_INVALID together with setting InvalidTransactionId as xmax. > So instead of a check like > if (tuple->t_infomask & HEAP_XMAX_INVALID) > we would have something like > if (HeapTupleHeaderXmaxIsInvalid(tuple)) > which checks both things -- and is easier to read, IMHO. Note that that bit of code isn't new. It just has moved a tiny bit. Similar tests are splattered all over tqual.c - so I have a hard time thinking this could be broken. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote:
> On 2014-06-02 12:48:01 -0400, Alvaro Herrera wrote:
> > Andres Freund wrote:
> >
> > > I do wonder if any of the other existing callers of HTSV are affected. I
> > > don't understand predicate.c well enough to be sure, but it looks to me
> > > like it could in theory lead to missed conflicts. Seems fairly
> > > unlikely to matter in practice though.
> >
> > One place where the difference in the new logic would cause a change is
> > the tupleIsAlive bit in IndexBuildHeapScan(), when building unique
> > indexes. Right now if the tuple is inserted by a remote running
> > transaction, and updated by it, we return DELETE_IN_PROGRESS, so we set
> > tupleIsAlive = false; so we wouldn't block if we see a tuple with that
> > value elsewhere. If we instead return INSERT_IN_PROGRESS we set
> > tupleIsAlive = true, and we would block until the inserter is done.
>
> Hm. Maybe I am missing something, but if either INSERT or
> DELETE_IN_PROGRESS is returned for a unique index in
> IndexBuildHeapScan() we'll wait for the xmin/xmax respectively and
> recheck the tuple, right? That recheck will then return a non-ephemeral
> status.

Yeah, that's how I interpret that maze of little loops using callbacks.
I don't think waiting on xmin is the same as waiting on xmax, however;
the xmin could stay running for a while, but the xmax could be a
quickly aborted subtransaction, for instance.

> > > I have to say I really hate the amount of repetitive code inside the
> > > individual visibility routines. Obviously it's nothing backpatchable and
> > > may become moot to a certain degree with the CSN work, but those are
> > > pretty close to being unmaintainable.
> >
> > One thing I'm now wondering while reading this code is those cases where
> > XMAX_INVALID is not set, but the value in Xmax is InvalidXid. We have
> > discussed these scenarios elsewhere and the conclusion seems to be that
> > they are rare but possible and therefore we should cater for them.
>
> Is there really a situation where that combination is possible for
> XMAX_INVALID? I don't see how.

I vaguely remember Robert described how this would work, even though the
precise details evidently escape me because I can't quite trace it now.
I can't remember what thread this was on, either.

> Are you possibly thinking of the case where 9.3 assumed wrongly for a
> while that a set XMAX_INVALID would be sufficient for freezing?

Nope, that's not it IIRC. It wasn't a bug but a valid corner case. I
think it involved a crash.

> > So instead of a check like
> > if (tuple->t_infomask & HEAP_XMAX_INVALID)
> > we would have something like
> > if (HeapTupleHeaderXmaxIsInvalid(tuple))
> > which checks both things -- and is easier to read, IMHO.
>
> Note that that bit of code isn't new. It just has moved a tiny
> bit. Similar tests are splattered all over tqual.c - so I have a hard
> time thinking this could be broken.

Yeah, I'm not saying this is a new bug; I was only reacting to your
suggestion that we dump all of tqual.c and rewrite it from scratch
(heh). Also, the corner case that produced this exact scenario is very
low probability IIRC, so even if it's broken today, it might not be all
that obvious that this is the cause.

I agree that the visibility routines are pretty hard to follow; and
evidently, given the number of times we've messed up, it's hard to get
all the little corner cases right. I don't know what a more maintainable
version of them would look like, however, particularly keeping in mind
that they are performance-critical.
-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2014-06-02 13:27:17 -0400, Alvaro Herrera wrote: > Andres Freund wrote: > > On 2014-06-02 12:48:01 -0400, Alvaro Herrera wrote: > > > Andres Freund wrote: > > > > > > > I do wonder if any of the other existing callers of HTSV are affected. I > > > > don't understand predicate.c well enough to be sure, but it looks to me > > > > like it'd could in theory lead to missed conflicts. Seems fairly > > > > unlikely to matter in practice though. > > > > > > One place where the difference in the new logic would cause a change is > > > the tupleIsAlive bit in IndexBuildHeapScan(), when building unique > > > indexes. Right now if the tuple is inserted by a remote running > > > transaction, and updated by it, we return DELETE_IN_PROGRESS, so we set > > > tupleIsAlive = false; so we wouldn't block if we see a tuple with that > > > value elsewhere. If we instead return INSERT_IN_PROGRESS we set > > > tupleIsAlive = true, and we would block until the inserter is done. > > > > Hm. Maybe I am missing something, but if either INSERT or > > DELETE_IN_PROGRESS is returned for a unique index in > > IndexBuildHeapScan() we'll wait for the xmin/xmax respectively and > > recheck the tuple, right? That recheck will then return a non-ephemeral > > status. > > Yeah, that's how I interpret that maze of little loops using callbacks. > I don't think waiting on xmin is the same as waiting on xmax, however; > the xmin could stay running for a while, but the xmax could be an > quickly aborted subtransaction, for instance. It essentially is. If xmax aborts you still have to deal with an 'INSERT_IN_PROGRESS' tuple because, as you say, xmin is still running. That's essentially my point. INSERT_IN_PROGRESS isn't a wrong answer. > > > One thing I'm now wondering while reading this code is those cases where > > > XMAX_INVALID is not set, but the value in Xmax is InvalidXid. We have > > > discussed these scenarios elsewhere and the conclusion seems to be that > > > they are rare but possible and therefore we should cater for them. > > > > Is there really a situation where that combination is possible for > > XMAX_INVALID? I don't see how. > > I vaguely remember Robert described how would this work, even though the > precise details evidently escape me because I can't quite trace it now. > I can't remember what thread this was on, either. Robert: Do you remember that case? Alvaro: In the end it'd not be very harmful - if it happens TransactionIdDidCommit() will return false (there's special case code for it). > Yeah, I'm not saying this is a new bug; I was only reacting to your > suggestion that we dump all of tqual.c and rewrite it from scratch > (heh). Heh. Not going to start with that today. Especially not in a patch that needs to be backpatched :) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote: > On 2014-06-02 13:27:17 -0400, Alvaro Herrera wrote: > > Andres Freund wrote: > > > On 2014-06-02 12:48:01 -0400, Alvaro Herrera wrote: > > > > Andres Freund wrote: > > > > > > > > > I do wonder if any of the other existing callers of HTSV are affected. I > > > > > don't understand predicate.c well enough to be sure, but it looks to me > > > > > like it'd could in theory lead to missed conflicts. Seems fairly > > > > > unlikely to matter in practice though. > > > > > > > > One place where the difference in the new logic would cause a change is > > > > the tupleIsAlive bit in IndexBuildHeapScan(), when building unique > > > > indexes. Right now if the tuple is inserted by a remote running > > > > transaction, and updated by it, we return DELETE_IN_PROGRESS, so we set > > > > tupleIsAlive = false; so we wouldn't block if we see a tuple with that > > > > value elsewhere. If we instead return INSERT_IN_PROGRESS we set > > > > tupleIsAlive = true, and we would block until the inserter is done. > > > > > > Hm. Maybe I am missing something, but if either INSERT or > > > DELETE_IN_PROGRESS is returned for a unique index in > > > IndexBuildHeapScan() we'll wait for the xmin/xmax respectively and > > > recheck the tuple, right? That recheck will then return a non-ephemeral > > > status. > > > > Yeah, that's how I interpret that maze of little loops using callbacks. > > I don't think waiting on xmin is the same as waiting on xmax, however; > > the xmin could stay running for a while, but the xmax could be an > > quickly aborted subtransaction, for instance. > > It essentially is. If xmax aborts you still have to deal with an > 'INSERT_IN_PROGRESS' tuple because, as you say, xmin is still > running. That's essentially my point. INSERT_IN_PROGRESS isn't a wrong > answer. Uh. Actually it strikes me that DELETE_IN_PROGRESS is a wrong answer in the XidIsInProgress(xmin) block anyhow. If the tuple is being inserted remotely, there is no way we can return DELETE_IN_PROGRESS, is there? We can never be sure that the deleting subxact is not going to abort. The only way for this transaction to say that the tuple is DELETE_IN_PROGRESS is if the *inserting* subtransaction is an aborted subxact; but we already tell that case apart. So DELETE_IN_PROGRESS is only a valid return in the new block, XidIsCurrentXact(xmin). Another thing I wanted to comment on originally is that I'm not sure about the CHECK_FOR_INTERRUPTS() additions. Aren't those only to serve the case of the current bug, or bugs of similar ilk? There is no way that a valid, non-bug situation is going to cause us to loop forever there. That said, there isn't much harm in having them there either, but I think I'd put them before the xact wait, not after. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2014-06-02 14:39:22 -0400, Alvaro Herrera wrote:
> Andres Freund wrote:
> > It essentially is. If xmax aborts you still have to deal with an
> > 'INSERT_IN_PROGRESS' tuple because, as you say, xmin is still
> > running. That's essentially my point. INSERT_IN_PROGRESS isn't a wrong
> > answer.
>
> Uh. Actually it strikes me that DELETE_IN_PROGRESS is a wrong answer in
> the XidIsInProgress(xmin) block anyhow. If the tuple is being inserted
> remotely, there is no way we can return DELETE_IN_PROGRESS, is there?

I don't think it'd break stuff if we did it in the !IsCurrent() case. But
yes, that's why I made the TransactionIdIsInProgress(xmin) block
unconditionally return INSERT_IN_PROGRESS.

> Another thing I wanted to comment on originally is that I'm not sure
> about the CHECK_FOR_INTERRUPTS() additions. Aren't those only to serve
> the case of the current bug, or bugs of similar ilk? There is no way
> that a valid, non-bug situation is going to cause us to loop forever
> there.

I think there are other possible cases where it loops for a long while.
Think e.g. of a row that's updated under contention by several backends
or such.

> That said, there isn't much harm in having them there either,
> but I think I'd put them before the xact wait, not after.

XactLockTableWait() internally catches interrupts if they happen while
waiting, so that's covered. I am mainly concerned with xmax changing
frequently...

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jun 2, 2014 at 1:35 PM, Andres Freund <andres@2ndquadrant.com> wrote: > Robert: Do you remember that case? > > Alvaro: In the end it'd not be very harmful - if it happens > TransactionIdDidCommit() will return false (there's special case code > for it). Not specifically, but I'd be surprised if it isn't possible. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi Sandro, On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote: > The attached script shows a plpgsql function that enters > an infinite loop which is not interrupted by statement_timeout. Pushed a fix. Thanks for the report! Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2014-06-04 19:03:15 -0400, Robert Haas wrote:
> On Mon, Jun 2, 2014 at 1:35 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Robert: Do you remember that case?
> >
> > Alvaro: In the end it'd not be very harmful - if it happens
> > TransactionIdDidCommit() will return false (there's special case code
> > for it).
>
> Not specifically, but I'd be surprised if it isn't possible.

Wouldn't that mean that every single visibility routine in tqual.c is
buggy?

I am not convinced there aren't further bugs in some corner cases in
tqual.c. But even a low-likelihood scenario of
xmax = InvalidTransactionId && (infomask & XMAX_INVALID) == 0
would have become visible by now, given how widespread/central those
tests are?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Jun 4, 2014 at 7:10 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-06-04 19:03:15 -0400, Robert Haas wrote:
>> On Mon, Jun 2, 2014 at 1:35 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > Robert: Do you remember that case?
>> >
>> > Alvaro: In the end it'd not be very harmful - if it happens
>> > TransactionIdDidCommit() will return false (there's special case code
>> > for it).
>>
>> Not specifically, but I'd be surprised if it isn't possible.
>
> Wouldn't that mean that every single visibility routine in tqual.c is
> buggy?
>
> I am not convinced there aren't further bugs in some corner cases in
> tqual.c. But even a low-likelihood scenario of
> xmax = InvalidTransactionId && (infomask & XMAX_INVALID) == 0
> would have become visible by now, given how widespread/central those
> tests are?

What specifically do you think will break? For example, in
HeapTupleSatisfiesMVCC(), if no xmax-related hint bits are set, it
looks to me like we'll end up here:

    if (!TransactionIdDidCommit(HeapTupleHeaderGetRawXmax(tuple)))
    {
        /* it must have aborted or crashed */
        SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
                    InvalidTransactionId);
        return true;
    }

Well, TransactionIdDidCommit() will call TransactionLogFetch() which
has special-case coding to handle non-normal transaction IDs, which
for InvalidTransactionId will return TRANSACTION_STATUS_ABORTED. Now
maybe that is dead code and always has been, but that wouldn't be my
bet going into it: I bet that got there for a reason.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
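[Editor's note: for reference, the special-case handling Robert alludes to
looks roughly like the following. This is a paraphrase of
src/backend/access/transam/transam.c from memory, not a verbatim quote.]

    /*
     * Paraphrased sketch of TransactionLogFetch(): permanent (non-normal)
     * XIDs are answered without consulting clog, and InvalidTransactionId
     * falls into the "aborted" bucket - which is what makes the
     * HeapTupleSatisfiesMVCC() branch above treat a zero xmax as
     * "aborted or crashed".
     */
    static XidStatus
    TransactionLogFetch(TransactionId transactionId)
    {
        if (!TransactionIdIsNormal(transactionId))
        {
            if (TransactionIdEquals(transactionId, BootstrapTransactionId))
                return TRANSACTION_STATUS_COMMITTED;
            if (TransactionIdEquals(transactionId, FrozenTransactionId))
                return TRANSACTION_STATUS_COMMITTED;
            return TRANSACTION_STATUS_ABORTED;  /* includes InvalidTransactionId */
        }

        /* ... normal XIDs: look the status up in clog ... */
    }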
Robert Haas <robertmhaas@gmail.com> writes:
> What specifically do you think will break? For example, in
> HeapTupleSatisfiesMVCC(), if no xmax-related hint bits are set, it
> looks to me like we'll end up here:
>
>     if (!TransactionIdDidCommit(HeapTupleHeaderGetRawXmax(tuple)))
>     {
>         /* it must have aborted or crashed */
>         SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
>                     InvalidTransactionId);
>         return true;
>     }
>
> Well, TransactionIdDidCommit() will call TransactionLogFetch() which
> has special-case coding to handle non-normal transaction IDs, which
> for InvalidTransactionId will return TRANSACTION_STATUS_ABORTED. Now
> maybe that is dead code and always has been, but that wouldn't be my
> bet going into it: I bet that got there for a reason.

No, the code to return that for InvalidTransactionId is just there
because it has to do *something*, and returning status COMMITTED is
clearly not too sane. I don't believe there was ever any expectation
that xmax equal to InvalidTransactionId would not have the XMAX_INVALID
hint bit set. If anyone had thought about that case, they'd probably
have chosen to put in a should-not-happen error for it, because it
strongly suggests you're looking at corrupted data.

			regards, tom lane
On 2014-06-04 19:27:54 -0400, Robert Haas wrote:
> On Wed, Jun 4, 2014 at 7:10 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-06-04 19:03:15 -0400, Robert Haas wrote:
> >> On Mon, Jun 2, 2014 at 1:35 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> > Robert: Do you remember that case?
> >> >
> >> > Alvaro: In the end it'd not be very harmful - if it happens
> >> > TransactionIdDidCommit() will return false (there's special case code
> >> > for it).
> >>
> >> Not specifically, but I'd be surprised if it isn't possible.
> >
> > Wouldn't that mean that every single visibility routine in tqual.c is
> > buggy?
> >
> > I am not convinced there aren't further bugs in some corner cases in
> > tqual.c. But even a low-likelihood scenario of
> > xmax = InvalidTransactionId && (infomask & XMAX_INVALID) == 0
> > would have become visible by now, given how widespread/central those
> > tests are?
>
> What specifically do you think will break?

If XMAX_INVALID may accidentally not be set even though there's no xmax,
other xmax flags could be bogus as well. Afaics there's no difference in
the handling between them. And then tests for LOCKED_ONLY, IS_MULTI and
such would be dodgy.

The only things that - afaics - are allowed to change the actual xmax
transactionid are:
* freezing
* heap_(update|delete|lock)_tuple

And of those only freezing will unset xmax to InvalidTransactionId. But
it re-sets XMAX_INVALID (and clears all other xmax flags). So I don't
really see a danger.

But you're right, even if there were cases where that scenario happens -
without any other odd flags - we'd still be safe.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Jun 05, 2014 at 01:04:51AM +0200, Andres Freund wrote: > Hi Sandro, > > On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote: > > The attached script shows a plpgsql function that enters > > an infinite loop which is not interrupted by statement_timeout. > > Pushed a fix. Thanks for the report! Thanks, will test asap. BTW, I noticed the commit contained no testcase, would it be safer to include one ? The smallest test I got to (not requiring PostGIS) is here: http://strk.keybit.net/tmp/crash.sql --strk;
On Thu, Jun 05, 2014 at 07:58:56AM +0200, Sandro Santilli wrote: > On Thu, Jun 05, 2014 at 01:04:51AM +0200, Andres Freund wrote: > > Hi Sandro, > > > > On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote: > > > The attached script shows a plpgsql function that enters > > > an infinite loop which is not interrupted by statement_timeout. > > > > Pushed a fix. Thanks for the report! > > Thanks, will test asap. I confirm the crash doesn't happen anymore, thanks again. --strk;
On 2014-06-05 07:58:56 +0200, Sandro Santilli wrote: > On Thu, Jun 05, 2014 at 01:04:51AM +0200, Andres Freund wrote: > > Hi Sandro, > > > > On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote: > > > The attached script shows a plpgsql function that enters > > > an infinite loop which is not interrupted by statement_timeout. > > > > Pushed a fix. Thanks for the report! > > Thanks, will test asap. > > BTW, I noticed the commit contained no testcase, would it be safer > to include one ? I have a separate commit with a testcase prepared. We're currently discussing about some infrastructure that would make it more precise... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Jun 05, 2014 at 10:35:01AM +0200, Andres Freund wrote: > On 2014-06-05 07:58:56 +0200, Sandro Santilli wrote: > > On Thu, Jun 05, 2014 at 01:04:51AM +0200, Andres Freund wrote: > > > Hi Sandro, > > > > > > On 2014-05-30 16:31:50 +0200, Sandro Santilli wrote: > > > > The attached script shows a plpgsql function that enters > > > > an infinite loop which is not interrupted by statement_timeout. > > > > > > Pushed a fix. Thanks for the report! > > > > Thanks, will test asap. > > > > BTW, I noticed the commit contained no testcase, would it be safer > > to include one ? > > I have a separate commit with a testcase prepared. We're currently > discussing about some infrastructure that would make it more precise... Great, thanks again to everyone involved in this fix! --strk;
Andres Freund <andres@2ndquadrant.com> wrote:

> I do wonder if any of the other existing callers of HTSV are
> affected. I don't understand predicate.c well enough to be sure,
> but it looks to me like it could in theory lead to missed
> conflicts. Seems fairly unlikely to matter in practice though.

Early in the evaluation of the multi-xact bugs around 9.3 someone
pointed out that the serializable transactions could be affected. I
looked at it then, and it was very hard to set up a case to
actually hit that, and there were no field reports of it, so I
figured I should let the dust settle on fixes in this area before
deciding what to do.  It appeared that the fixes being made for
other reasons might fix this as well.  So far I haven't gotten the
feeling that this area has yet settled down enough to get back to
this.  Perhaps we've now gotten to that point?

Have we really pinned down the semantics of the return values
for HeapTupleSatisfiesVacuum() at this point?  If so, I guess the
next question is, where is the exact meaning of each return value
documented?  The five to seven word comments in the enum value
declaration *seem* clear on the face of it, but I don't actually
have confidence what they really mean when, for example, a
subtransaction has rolled back.  It appears that
HEAPTUPLE_INSERT_IN_PROGRESS can be returned for a DELETE under
some circumstances, but it would take a lot of work to
reverse-engineer exactly what the various conditions that can cause
that return value are.  It seems like someone who knows what these
values really mean should capture that in a code comment somewhere.

What the code in predicate.c is using this for is to determine, for
a given tuple which the calling process is reading, whether it is
visible to the calling process but has been deleted by another
transaction (whose work this snapshot can't see), or is not visible
to the calling process and has been inserted by another transaction
(whose work this transaction can't see).  To qualify, the change
from the other transaction must be either committed or still
pending (i.e., it might commit).  With the tuple's visibility to
the current process known, what return values answer that question
without further checking, and what further checking is needed for
which other return codes?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Kevin Grittner <kgrittn@ymail.com> writes:
> Have we really pinned down the semantics of the return values
> for HeapTupleSatisfiesVacuum() at this point?

Well, now that Andres unilaterally redefined HEAPTUPLE_INSERT_IN_PROGRESS
as meaning whatever the heck was easiest, I'd say no ;-). But we have to
think about what we do want from it. I'm not sure that we can positively
guarantee to distinguish all the possible states of an update-in-progress
tuple from outside the updating backend, and it likely would not be worth
the trouble to try since the answer could change at any instant anyway.

For the statistical purposes I was on about in the other thread, I would
be satisfied if we could distinguish "will be live if all relevant
transactions commit" from "will be dead if all relevant transactions
commit". For other combinations like
will-be-live-if-deleting-subtransaction-rolls-back-but-outer-transaction-commits,
there's really no way for VACUUM to decide whether to count the tuple as
live anyway. Yet it'd be nice if we produced a live-tuple-count estimate
that had some relationship to reality in *some* scenario, and
everything-commits seems like the best choice of assumption.

It sounds like SSI might have some different requirements --- can you
specify what you need?

> What the code in predicate.c is using this for is to determine, for
> a given tuple which the calling process is reading, whether it is
> visible to the calling process but has been deleted by another
> transaction (whose work this snapshot can't see), or is not visible
> to the calling process and has been inserted by another transaction
> (whose work this transaction can't see).

I'm not sure why you'd need HeapTupleSatisfiesVacuum for that at all
--- it sounds like a plain visible-according-to-query-snapshot test.
Certainly HTSV is unconcerned with visibility as such, so it can't
satisfy this requirement by itself.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Kevin Grittner <kgrittn@ymail.com> writes:
>> Have we really pinned down the semantics of the return values
>> for HeapTupleSatisfiesVacuum() at this point?
>
> Well, now that Andres unilaterally redefined HEAPTUPLE_INSERT_IN_PROGRESS
> as meaning whatever the heck was easiest, I'd say no ;-).

That's kinda what I was afraid of.

> But we have to
> think about what we do want from it.  I'm not sure that we can positively
> guarantee to distinguish all the possible states of an update-in-progress
> tuple from outside the updating backend, and it likely would not be worth
> the trouble to try since the answer could change at any instant anyway.

That sounds like way more than SSI needs anyway.

> For the statistical purposes I was on about in the other thread, I would
> be satisfied if we could distinguish "will be live if all relevant
> transactions commit" from "will be dead if all relevant transactions
> commit".

If I'm understanding you, and if HEAPTUPLE_LIVE and HEAPTUPLE_DEAD
are clear-cut "nothing is in progress" cases, I think that might be
enough.  We know coming in whether the tuple is visible to our own
transaction; the question is whether its existence has been or is
potentially being changed by a top level transaction whose work we
can't see.  Ideally we would ignore the effects of a subtransaction
underneath such a top level transaction if that subtransaction has
rolled back, but it would only compromise efficiency (not
correctness) if we didn't ignore the work of subtransactions which
have already been rolled back on an uncommitted top-level transaction.

>> What the code in predicate.c is using this for is to determine, for
>> a given tuple which the calling process is reading, whether it is
>> visible to the calling process but has been deleted by another
>> transaction (whose work this snapshot can't see), or is not visible
>> to the calling process and has been inserted by another transaction
>> (whose work this transaction can't see).
>
> I'm not sure why you'd need HeapTupleSatisfiesVacuum for that at all
> --- it sounds like a plain visible-according-to-query-snapshot test.

That is *one* of the things we need, and it is passed in as a bool
parameter.  We also need to know whether another transaction has
committed, or might commit, something which would delete a visible
tuple or insert a tuple which is not visible to us (but which has
been read on a scan).  In other words, would the results of this
scan be different if run again after commit of the other
transaction?  We're talking about the
CheckForSerializableConflictOut() function, in case my description
here isn't clear; perhaps the function comments and/or code will
clarify things.

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/storage/lmgr/predicate.c;h=7c8d53e6a5a44e8bbbda453c730bb5073f8a3842;hb=master#l3850

> Certainly HTSV is unconcerned with visibility as such, so it can't
> satisfy this requirement by itself.

Right.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2014-06-07 15:20:07 -0700, Kevin Grittner wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Kevin Grittner <kgrittn@ymail.com> writes:
> >> Have we really pinned down the semantics of the return values
> >> for HeapTupleSatisfiesVacuum() at this point?
> >
> > Well, now that Andres unilaterally redefined HEAPTUPLE_INSERT_IN_PROGRESS
> > as meaning whatever the heck was easiest, I'd say no ;-).

Meh. It's not about defining it as whatever was easiest. The issue is that
there's simply no clear definition of what's correct to return when both
xmin and xmax are in progress. You can argue that
HEAPTUPLE_INSERT_IN_PROGRESS is correct because it's not clear that the row
has ever been inserted. You can argue that DELETE_IN_PROGRESS is more
correct since it has actually been deleted. The latter has the problem that
callers might reasonably expect xmin to actually have committed.

Note that the previous definition was clearly more bogus (which was actively
causing bugs!): When xmax was set but aborted it still returned
DELETE_IN_PROGRESS.

> > But we have to
> > think about what we do want from it. I'm not sure that we can positively
> > guarantee to distinguish all the possible states of an update-in-progress
> > tuple from outside the updating backend, and it likely would not be worth
> > the trouble to try since the answer could change at any instant anyway.
>
> That sounds like way more than SSI needs anyway.

Note that the current code *does* give an accurate answer in that case. The
only behavioural change is when xmin and xmax are *both* in progress and are
*not* running in the current backend.

Before my change it was:

    else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
    {
        if (tuple->t_infomask & HEAP_XMAX_INVALID)  /* xid invalid */
            return HEAPTUPLE_INSERT_IN_PROGRESS;
        /* only locked? run infomask-only check first, for performance */
        if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) ||
            HeapTupleHeaderIsOnlyLocked(tuple))
            return HEAPTUPLE_INSERT_IN_PROGRESS;
        /* inserted and then deleted by same xact */
        return HEAPTUPLE_DELETE_IN_PROGRESS;
    }

now it's

    else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
    {
        if (tuple->t_infomask & HEAP_XMAX_INVALID)  /* xid invalid */
            return HEAPTUPLE_INSERT_IN_PROGRESS;
        /* only locked? run infomask-only check first, for performance */
        if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) ||
            HeapTupleHeaderIsOnlyLocked(tuple))
            return HEAPTUPLE_INSERT_IN_PROGRESS;
        /* inserted and then deleted by same xact */
        if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(tuple)))
            return HEAPTUPLE_DELETE_IN_PROGRESS;
        /* deleting subtransaction must have aborted */
        return HEAPTUPLE_INSERT_IN_PROGRESS;
    }
    else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
    {
        /*
         * It'd be possible to discern between INSERT/DELETE in progress
         * here by looking at xmax - but that doesn't seem beneficial for
         * the majority of callers and even detrimental for some. We'd
         * rather have callers look at/wait for xmin than xmax. It's
         * always correct to return INSERT_IN_PROGRESS because that's
         * what's happening from the view of other backends.
         */
        return HEAPTUPLE_INSERT_IN_PROGRESS;
    }

Disregarding all that verbiage, the old behaviour was:

If xmin is in progress in this or another backend and xmax was ever set:
    HEAPTUPLE_DELETE_IN_PROGRESS

now it's:

If xmin is in progress and it's running in the current backend:
    return INSERT/DELETE_IN_PROGRESS according to xmax
otherwise if xmin is in progress:
    INSERT_IN_PROGRESS

All the rest of the cases are unchanged.

> > For the statistical purposes I was on about in the other thread, I would
> > be satisfied if we could distinguish "will be live if all relevant
> > transactions commit" from "will be dead if all relevant transactions
> > commit".
>
> If I'm understanding you, and if HEAPTUPLE_LIVE and HEAPTUPLE_DEAD
> are clear-cut "nothing is in progress" cases, I think that might be
> enough.

Ok. Those haven't changed.

> We know coming in whether the tuple is visible to our own
> transaction;

So do I understand it correctly that predicate.c will never invoke
HeapTupleSatisfiesVacuum() when the tuple has been created (INSERT, new
UPDATE version) by a currently running backend? Does that include the case
where the looked-at tuple has been created by the current backend?

> the question is whether its existence has been or is
> potentially being changed by a top level transaction whose work we
> can't see.

I.e. this is effectively only looking for changes of xmax? Because otherwise
we'd not be looking at the tuple?

> Ideally we would ignore the effects of a subtransaction
> underneath such a top level transaction if that subtransaction has
> rolled back, but it would only compromise efficiency (not
> correctness) if we didn't ignore the work of subtransactions which
> have already been rolled back on an uncommitted top-level transaction.

That's the case.

> >> What the code in predicate.c is using this for is to determine, for
> >> a given tuple which the calling process is reading, whether it is
> >> visible to the calling process but has been deleted by another
> >> transaction (whose work this snapshot can't see), or is not visible
> >> to the calling process and has been inserted by another transaction
> >> (whose work this transaction can't see).
> >
> > I'm not sure why you'd need HeapTupleSatisfiesVacuum for that at all
> > --- it sounds like a plain visible-according-to-query-snapshot test.
>
> That is *one* of the things we need, and it is passed in as a bool
> parameter. We also need to know whether another transaction has
> committed, or might commit, something which would delete a visible
> tuple or insert a tuple which is not visible to us (but which has
> been read on a scan). In other words, would the results of this
> scan be different if run again after commit of the other
> transaction? We're talking about the
> CheckForSerializableConflictOut() function, in case my description
> here isn't clear; perhaps the function comments and/or code will
> clarify things.

I'm a bit confused by this - now you're talking about it touching tuples
which aren't visible to the current backend?

The relevant snippets are these cases:

    case HEAPTUPLE_DELETE_IN_PROGRESS:
        xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
        break;
    case HEAPTUPLE_INSERT_IN_PROGRESS:
        xid = HeapTupleHeaderGetXmin(tuple->t_data);
        break;

Does it I) cause a bug if you're looking at xmax because
HEAPTUPLE_DELETE_IN_PROGRESS was returned even though xmin hasn't even
committed? That's the case 621a99a fixed.

It looks like not, because a) you're chasing the xid to the toplevel xid
anyway, b) you're ignoring the case where it was created by our own backend
anyway.

Does it II) cause a bug if you're looking at xmin because
HEAPTUPLE_INSERT_IN_PROGRESS was returned even though xmax was set *iff*
xmin is also in progress by the same backend?

It doesn't look so either, for I)'s a) reason. I.e. you'd in the end look at
the same xid anyway.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services