Home > mailing lists

Re: Cleaning up recovery from subtransaction start failure - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Cleaning up recovery from subtransaction start failure
Date	September 14, 2004 01:40:57
Msg-id	27813.1095122446@sss.pgh.pa.us Whole thread Raw
In response to	Cleaning up recovery from subtransaction start failure (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Cleaning up recovery from subtransaction start failure
List	pgsql-hackers

Tree view

I wrote:
> I don't actually like StartAbortedSubTransaction at all --- ISTM that if
> you get a failure trying to enter a subxact, it's better *not* to enter
> the subxact and instead to treat the error as putting the calling xact
> in abort state.

Actually, the more I think about this the more I like the idea I just
mentioned in the other thread, which is to not assign a subxact an XID
right away. The advantage is that acquiring an XID and a lock on it
are definitely operations that could fail (cf original problem), and
it becomes very painful to ensure that xact.c's internal state stays
legal in event of a failure there, if we insist that that happens during
subxact start.

I've just looked through the modules that use GetCurrentTransactionId
to label their internal state, and without exception they'd be perfectly
happy with an ersatz substitute that only distinguishes among
subtransactions of the current top transaction. So here is what I am
thinking:

* New datatype SubTransactionId == uint32 (just for clarity of code)

* Special values InvalidSubTransactionId == 0 and TopSubTransactionId == 1; we initialize the counter for each main
transactionso that subxacts get IDs beginning at 2.

* Subtransaction start looks about like this:
subxactid = ++SubTransactionIdCounter;if (subxactid == InvalidSubTransactionId){ // oops, wrapped around
--SubTransactionIdCounter; elog(ERROR, "out of subxact xids");}

subxactstate = palloc(sizeof(TransactionStateData));
// initialize subxact block and link it into state stack
// call modules that need start-of-subxact calls

Note that if we get an out-of-memory failure in the palloc, we have not
done anything except waste a SubTransactionId value, so there is no
recovery problem there. Once the palloc succeeds, no error is possible
between there and having a fully-set-up subxact. Errors in the "call
modules" phase, if any, can be treated as errors occuring inside an
already-established subxact. Or we could choose to abort and pop the
subxact, if you like the theory that failure during SAVEPOINT should
not create a broken subxact; this isn't too hard since we know we have
gotten to a valid stack state that abort can cope with, which was
definitely not guaranteed before.

The places that currently call GetCurrentTransactionId for labeling
internal state would all call a new GetCurrentSubTransactionId routine.
Other than that and renaming/retyping their state fields, they'd not
need much change in most cases.

The remaining calls of GetCurrentTransactionId would mostly be the ones
in heapam.c that are labeling tuples about to be written to disk.

One minor point is that GetCurrentTransactionId would now become a
routine with a nonzero prospect of failure. We'd want to be sure that
it is not called inside critical sections (since ERROR -> PANIC inside
such sections), unless we are certain a current XID has been assigned.
This would require only trivial code rearrangements in heapam.c, but
there is a call in XLogInsert that is a bigger hazard. I believe all
calls of XLogInsert are *already* inside critical sections, and so if
any of them are in code paths where no XID has been assigned, we have
a hard-to-spot risk of system panic. But there probably isn't any good
reason to be emitting an XLOG record without having assigned an XID.
I think I would fix this by not having XLogInsert do its own call,
but requiring callers to pass in the XID to use. It will be easy to
make the callers do GetCurrentTransactionId before they enter their
critical sections.

Bottom line: I'm now leaning towards doing this for 8.0, since it would
make for a useful improvement in robustness, quite aside from any
performance gains from eliminating unnecessary XID assignments.

Comments?
regards, tom lane

pgsql-hackers by date:

From: Alvaro Herrera
Date: 14 September 2004, 01:20:36
Subject: Re: pg_locks view and user locks

From: "Greg Sabino Mullane"
Date: 14 September 2004, 02:14:31
Subject: libpq and prepared statements progress for 8.0

Re: Cleaning up recovery from subtransaction start failure - Mailing list pgsql-hackers

Previous

Next