On Tue, Aug 10, 2004 at 12:24:06PM -0400, Tom Lane wrote:
> It may be that we do not care because pg_subtrans doesn't have to be
> valid after a crash, but I haven't seen any proof of that theory.
Let's suppose we crash between creating a child transaction and marking
it as done. What we have to ensure after a crash is that if we marked
the parent as committed, the child has to be marked committed too. The
WAL record carries this information already, and on recovery, the child
will be marked COMMIT.
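To make that recovery rule concrete, here is a toy sketch. It is not the real redo code: ToyCommitRecord, toy_clog and toy_redo_commit are hypothetical names, and I am assuming only that the commit record lists the child Xids next to the parent.

```c
#include <assert.h>

/* Toy model only: every name here is hypothetical, not backend code. */
typedef unsigned int ToyXid;

enum { TOY_IN_PROGRESS = 0, TOY_COMMITTED = 1, TOY_ABORTED = 2 };

#define TOY_MAX_XID      64
#define TOY_MAX_CHILDREN 8

/* Stand-in for pg_clog; zero-initialized, so everything starts in progress. */
static int toy_clog[TOY_MAX_XID];

/* Stand-in for a commit WAL record that carries the committed children. */
typedef struct ToyCommitRecord
{
    ToyXid parent;
    int    nchildren;
    ToyXid children[TOY_MAX_CHILDREN];
} ToyCommitRecord;

/*
 * Replay one commit record: mark the parent and every listed child as
 * committed.  Nothing here reads pg_subtrans; the record alone suffices.
 */
static void
toy_redo_commit(const ToyCommitRecord *rec)
{
    assert(rec->nchildren >= 0 && rec->nchildren <= TOY_MAX_CHILDREN);
    assert(rec->parent < TOY_MAX_XID);

    toy_clog[rec->parent] = TOY_COMMITTED;
    for (int i = 0; i < rec->nchildren; i++)
    {
        assert(rec->children[i] < TOY_MAX_XID);
        toy_clog[rec->children[i]] = TOY_COMMITTED;
    }
}
```

Replaying a record for parent 10 with children 11 and 12 leaves all three committed in toy_clog, whatever state the children were in at crash time, and untouched Xids stay in progress.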
The whole point of the subtrans info is to be available _while_ the
transaction tree is running. If there is a crash, then by definition no
backend can be running when we return, so pg_subtrans info is useless at
that point. We only need pg_clog to be correct.
> And if that theory is correct, then it is a seriously bad design to be
> using the same code infrastructure for both pg_clog and pg_subtrans.
> Every fsync on pg_subtrans is wasted effort if that is going to be our
> approach.
Right, but AFAICS both pg_clog and pg_subtrans are only fsync'ed during
checkpoint and shutdown, so it doesn't seem that costly. We could
certainly skip calling CheckPointSUBTRANS(), or make it a no-op ...
> We should in fact just delete pg_subtrans and re-init it to zeroes
> during postmaster start...
Is it worth the duplicated code? It won't be consulted anyway for
pre-crash Xids, because TransactionIdIsInProgress will return early by
means of RecentGlobalXmin.
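A rough sketch of the early return I mean, as a toy model rather than the actual procarray code. All toy_* names are made up, and the comparison ignores the special (non-normal) Xids that the real TransactionIdPrecedes has to treat separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, not backend code. */
typedef uint32_t ToyXid;

#define TOY_MAX_RUNNING 8

static ToyXid toy_running[TOY_MAX_RUNNING]; /* Xids of live backends */
static int    toy_nrunning = 0;
static int    toy_slow_path_calls = 0;      /* counts expensive lookups */

/* Wraparound-aware "a is older than b", using modulo-2^32 arithmetic. */
static bool
toy_xid_precedes(ToyXid a, ToyXid b)
{
    return (int32_t) (a - b) < 0;
}

/* Stand-in for the expensive part: scanning backends, consulting subtrans. */
static bool
toy_slow_path(ToyXid xid)
{
    assert(toy_nrunning >= 0 && toy_nrunning <= TOY_MAX_RUNNING);
    toy_slow_path_calls++;
    for (int i = 0; i < toy_nrunning; i++)
    {
        if (toy_running[i] == xid)
            return true;
    }
    return false;
}

/*
 * Anything older than the global xmin cannot be running, so we answer
 * without touching the slow path (and hence without reading subtrans).
 */
static bool
toy_xid_is_in_progress(ToyXid xid, ToyXid recent_global_xmin)
{
    if (toy_xid_precedes(xid, recent_global_xmin))
        return false;
    return toy_slow_path(xid);
}
```

With a global xmin of 100, asking about Xid 90 returns false and leaves toy_slow_path_calls unchanged; only Xids at or after 100 reach the lookup. After a crash every pre-crash Xid is on the cheap side of that test, which is why stale pg_subtrans contents would never be read.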
On a related note: if we mark an Xid with SUBTRANS COMMIT and later crash
without updating it, the main Xid will remain in in-progress status. At
what point is it marked aborted? I can see such a status change only in
XactLockTableWait. This may be important because we will change the
subtransaction to aborted state only if we see the parent in aborted
state too.
--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias)