Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365) - Mailing list pgsql-bugs

From Alvaro Herrera
Subject Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)
Msg-id 1317070986-sup-4844@alvh.no-ip.org
In response to Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs

Excerpts from Tom Lane's message of Mon Sep 26 13:26:37 -0300 2011:
>
> yamt@mwd.biglobe.ne.jp (YAMAMOTO Takashi) writes:
> >> Maybe, but I'd still like to see a test case, because I can't reproduce
> >> any such problem by preparing ROLLBACK in an aborted transaction.
>
> > reading GetTransactionSnapshot, it seems that the problem happens
> > only with IsolationUsesXactSnapshot() true.
>
> Hmm.  I'm inclined to think that this demonstrates a bug in snapshot
> management, not so much in plancache.  We have plancache doing
>
>         PushActiveSnapshot(GetTransactionSnapshot());
>
> and then later
>
>         PopActiveSnapshot();
>
> and at this point surely it is not plancache's fault if there is any
> remaining refcount for the snapshot.  There is, though, because
> GetTransactionSnapshot saved a refcount in TopTransactionResourceOwner.
> I think it's snapmgr.c's responsibility to make sure that that's cleaned
> up, and it's not doing so.

Agreed.
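
To make the leak described above concrete, here is a minimal standalone sketch in C. It is a toy model under stated assumptions, not PostgreSQL source: every `Toy`/`toy_` name is hypothetical, and the only behavior taken from the thread is that GetTransactionSnapshot registers a reference with TopTransactionResourceOwner when IsolationUsesXactSnapshot() is true, and that the Push/Pop pair does not touch that reference.

```c
#include <assert.h>

/* Toy model only: these types and functions are hypothetical stand-ins,
 * not the real snapmgr.c / resowner.c definitions. */
typedef struct ToyResourceOwner
{
    int nsnapshots;             /* snapshots registered with this owner */
} ToyResourceOwner;

typedef struct ToySnapshot
{
    int active_count;           /* references from the active-snapshot stack */
    int regd_count;             /* references held by resource owners */
} ToySnapshot;

static ToyResourceOwner toy_top_xact_owner = {0};
static ToySnapshot toy_xact_snapshot = {0, 0};
static int toy_first_snapshot_set = 0;

/* Models GetTransactionSnapshot with IsolationUsesXactSnapshot() true:
 * the first call registers the snapshot with the top transaction owner. */
ToySnapshot *
toy_get_transaction_snapshot(void)
{
    if (!toy_first_snapshot_set)
    {
        toy_xact_snapshot.regd_count++;
        toy_top_xact_owner.nsnapshots++;
        toy_first_snapshot_set = 1;
    }
    return &toy_xact_snapshot;
}

/* Models PushActiveSnapshot / PopActiveSnapshot: only the active count moves. */
void toy_push_active(ToySnapshot *snap) { snap->active_count++; }
void toy_pop_active(ToySnapshot *snap)  { snap->active_count--; }

/* Models the cleanup that has to run before the owner is deleted. */
void
toy_release_xact_snapshot(void)
{
    if (toy_first_snapshot_set)
    {
        toy_xact_snapshot.regd_count--;
        toy_top_xact_owner.nsnapshots--;
        toy_first_snapshot_set = 0;
    }
}

/* Returns 1 when deleting the owner would satisfy nsnapshots == 0. */
int
toy_owner_is_clean(void)
{
    return toy_top_xact_owner.nsnapshots == 0;
}
```

Calling toy_get_transaction_snapshot(), then toy_push_active() and toy_pop_active() on the result, leaves active_count at zero while toy_owner_is_clean() still returns 0. That mirrors the failed assertion: the caller balanced its own push and pop, and the remaining reference belongs to the snapshot manager. Only toy_release_xact_snapshot() brings the owner back to a deletable state.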

> The place where that refcount normally gets dropped is
> AtEarlyCommit_Snapshot, but that isn't going to be called at all in
> aborted-transaction cleanup.  Worse, if we just transposed it over to be
> called in a place in AbortTransaction comparable to where it's called
> during commit, that still wouldn't fix the problem, because when the
> ROLLBACK happens, we've already aborted the transaction.

... ouch.

> I think that AtEarlyCommit_Snapshot is misdesigned, and that far from
> being done "early" in commit/abort, it needs to be done "late", like
> somewhere not very long before the
> ResourceOwnerDelete(TopTransactionResourceOwner) calls.  There is no
> very good reason to think that someone might not ask for a snapshot
> during commit processing.
>
> Alvaro, do you happen to remember why this got designed as an "early"
> transaction shutdown action, rather than delaying it as long as
> possible?

As far as I remember, the only principle was that it had to run before
ResourceOwner cleanup.  Commit 7b640b0345dc4fbd39ff568700985b432f6afa07
introduces that "early" call; ResOwner support had been introduced 10
days before in 6bbef4e5383c99d93aa974e2c79d328cfbd1c4a9.  I probably
just tried it out and noticed that resowner.c complained if I didn't
drop the refcount prior to its own cleanup.

I don't think I ever considered the scenario of calls in aborted
transactions.

Shall I work on a fix?  I expect you are plenty busy with commitfest
stuff, but please let me know otherwise.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
