On Tue, 2009-02-24 at 22:29 +0200, Heikki Linnakangas wrote:
> Oh, right... But we have the same problem with the subtransactions,
> don't we? This block:
>
> > /*
> > * If our state information is later for this proc, then
> > * overwrite it. It's possible for a commit and possibly
> > * a new transaction record to have arrived in WAL in between
> > * us doing GetRunningTransactionData() and grabbing the
> > * WALInsertLock, so we musn't assume we always know best.
> > */
> > if (XLByteLT(proc->lsn, lsn))
> > {
> > TransactionId *subxip = (TransactionId *) &(xlrec->xrun[xlrec->xcnt]);
> >
> > proc->lsn = lsn;
> > /* proc-> pid stays 0 for Recovery Procs */
> >
> > proc->subxids.nxids = rxact[xid_index].nsubxids;
> > proc->subxids.overflowed = rxact[xid_index].overflowed;
> >
> > memcpy(proc->subxids.xids, subxip,
> > rxact[xid_index].nsubxids * sizeof(TransactionId));
> >
> > /* Remove subtransactions from UnobservedXids also */
> > if (unobserved)
> > {
> > for (index = 0; index < rxact[xid_index].nsubxids; index++)
> > UnobservedTransactionsRemoveXid(subxip[index + rxact[xid_index].subx_offset], false);
> > }
> > }
>
> overwrites subxids array, and will resurrect any already aborted
> subtransaction.
>
> Isn't XLByteLT(proc->lsn, lsn) always true, because 'lsn' is the lsn of
> the WAL record we're redoing, so there can't be any procs with an LSN
> higher than that?
I'm wondering whether we need those circumstances at all.
The main role of ProcArrayUpdateRecoveryTransactions() is two-fold
* initialise snapshot when there isn't one
* reduce possibility of FATAL errors that don't write abort records
Neither of those needs us to update the subxid cache, so we'd be better
off avoiding that altogether in the common case. So we should be able to
ignore the lsn and race conditions altogether.
It might even be more helpful to explicitly separate those twin roles so
the code is clearer.
-- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support