Simon Riggs wrote:
> On Thu, 2009-02-12 at 14:23 +0200, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> On Thu, 2009-02-12 at 09:50 +0200, Heikki Linnakangas wrote:
>>>> So far so good, but what about all the other callers of
>>>> SubTransGetParent()? For example, XactLockTableWait will fail an
>>>> assertion if asked to wait on a subtransaction which is then released.
>>> I agree that it could fail the assertion, though it is clear that the
>>> assertion should now be removed.
>> No, then you just get an infinite loop instead, trying to get the parent
>> of 0 over and over again.
>
> There is no infinite loop. Try it, or read TransactionIdIsInProgress().

I did, my CPU was pegged at 100%. Hmm, attaching with a debugger shows
that it's not looping within XactLockTableWait as I assumed. Instead,
XactLockTableWait returns without waiting on the parent, so we get into
a busy loop in _bt_do_insert, trying to wait on the transaction over
and over again.
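
To make the failure mode concrete, here is a toy model of it. This is only
a sketch, not PostgreSQL source: every toy_* name, the fixed-size arrays
and the "everything ends at once" shortcut are made up for illustration.
xid 5 stands for the top-level transaction and xid 6 for its released
subtransaction, which is still running and holds no lock of its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define TOY_MAX_XID 16

/* Toy state, all hypothetical: index is the xid, slot 0 is "invalid". */
TransactionId toy_parent[TOY_MAX_XID];   /* stand-in for pg_subtrans */
bool toy_running[TOY_MAX_XID];           /* stand-in for "in progress" */
bool toy_has_lock[TOY_MAX_XID];          /* stand-in for a lock table entry */

void toy_reset(void)
{
    memset(toy_parent, 0, sizeof(toy_parent));
    memset(toy_running, 0, sizeof(toy_running));
    memset(toy_has_lock, 0, sizeof(toy_has_lock));
}

/*
 * Walk up the parent chain looking for a lock to wait on.
 * Returns true if a lock holder was found, false if we fell off
 * the chain without waiting on anything.
 */
bool toy_wait(TransactionId xid)
{
    for (;;)
    {
        assert(xid < TOY_MAX_XID);
        if (toy_has_lock[xid])
        {
            /* A real backend would sleep here; the toy ends everything. */
            memset(toy_running, 0, sizeof(toy_running));
            memset(toy_has_lock, 0, sizeof(toy_has_lock));
            return true;
        }
        if (!toy_running[xid])
            return false;       /* includes xid 0: nothing to wait on */
        xid = toy_parent[xid];  /* 0 when the parent was never recorded */
    }
}

/* Caller that retries while the xid still looks in progress. */
int toy_insert_attempts(TransactionId xid, int max_attempts)
{
    int attempts = 0;

    while (toy_running[xid] && attempts < max_attempts)
    {
        (void) toy_wait(xid);
        attempts++;
    }
    return attempts;
}
```

With no parent recorded for xid 6, toy_wait() returns immediately and the
caller spins until it hits max_attempts. With toy_parent[6] = 5 it waits
once on the top-level transaction and is done.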
>>> The logic is: if there is no lock table entry for that xid *and* it is
>>> not in progress *and* it is not in pg_subtrans, then it must have been
>>> an aborted subtransaction of a currently active xact or it has otherwise
>>> completed.
>> Right, we got it right that far. But after the subtransaction has
>> completed, the question is: what's its parent? That's what the patch got
>> wrong.
>
> We can find that out from procarray, since a subcommitted xid will still
> be present in the subxid cache of its parent (by definition, otherwise
> it will be marked in pg_subtrans).

Unless the top transaction just committed. Looking at the other callers
of SubTransGetParent, I think it would introduce a race condition to
TransactionIdDidAbort and TransactionIdDidCommit.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com