On Tue, 2009-02-24 at 23:41 +0000, Simon Riggs wrote:
> On Tue, 2009-02-24 at 22:29 +0200, Heikki Linnakangas wrote:
> > overwrites subxids array, and will resurrect any already aborted
> > subtransaction.
> >
> > Isn't XLByteLT(proc->lsn, lsn) always true, because 'lsn' is the lsn of
> > the WAL record we're redoing, so there can't be any procs with an LSN
> > higher than that?
>
> I'm wondering whether we need those circumstances at all.
>
> The main role of ProcArrayUpdateRecoveryTransactions() is two-fold
> * initialise snapshot when there isn't one
> * reduce possibility of FATAL errors that don't write abort records
>
> Neither of those needs us to update the subxid cache, so we'd be better
> off avoiding that altogether in the common case. So we should be able to
> ignore the lsn and race conditions altogether.

We still have a race condition for the initial snapshot, so your concern
still holds. Thanks for highlighting it.

I'm in the middle of rewriting ProcArrayUpdateRecoveryTransactions() to
avoid errors caused by these race conditions. The LSN flag was an
attempt to do that, but was insufficient and has now been removed.

I'll discuss it more when I've got it working. Seems like we need
working code now rather than lengthy debates. I see a solution and
almost have it done.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support