Re: Hot standby, recovery procs - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Hot standby, recovery procs
Date
Msg-id 1235518895.16176.285.camel@ebony.2ndQuadrant
Whole thread Raw
In response to Re: Hot standby, recovery procs  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Hot standby, recovery procs
List pgsql-hackers
On Tue, 2009-02-24 at 22:29 +0200, Heikki Linnakangas wrote:

> Oh, right... But we have the same problem with the subtransactions, 
> don't we? This block:
> 
> >         /*
> >          * If our state information is later for this proc, then 
> >          * overwrite it. It's possible for a commit and possibly
> >          * a new transaction record to have arrived in WAL in between
> >          * us doing GetRunningTransactionData() and grabbing the
> >          * WALInsertLock, so we musn't assume we always know best.
> >          */
> >         if (XLByteLT(proc->lsn, lsn))
> >         {
> >             TransactionId     *subxip = (TransactionId *) &(xlrec->xrun[xlrec->xcnt]);
> > 
> >             proc->lsn = lsn;
> >             /* proc-> pid stays 0 for Recovery Procs */
> > 
> >             proc->subxids.nxids = rxact[xid_index].nsubxids;
> >             proc->subxids.overflowed = rxact[xid_index].overflowed;
> > 
> >             memcpy(proc->subxids.xids, subxip, 
> >                         rxact[xid_index].nsubxids * sizeof(TransactionId));
> > 
> >             /* Remove subtransactions from UnobservedXids also */
> >             if (unobserved)
> >             {
> >                 for (index = 0; index < rxact[xid_index].nsubxids; index++)
> >                     UnobservedTransactionsRemoveXid(subxip[index + rxact[xid_index].subx_offset], false);
> >             }
> >         }
> 
> overwrites subxids array, and will resurrect any already aborted 
> subtransaction.
> 
> Isn't XLByteLT(proc->lsn, lsn) always true, because 'lsn' is the lsn of 
> the WAL record we're redoing, so there can't be any procs with an LSN 
> higher than that?

I'm wondering whether we need those circumstances at all.

The main role of ProcArrayUpdateRecoveryTransactions() is two-fold
* initialise snapshot when there isn't one
* reduce possibility of FATAL errors that don't write abort records

Neither of those needs us to update the subxid cache, so we'd be better
off avoiding that altogether in the common case. So we should be able to
ignore the lsn and race conditions altogether.

It might even be more helpful to explicitly separate those twin roles so
the code is clearer.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Synchronous replication & Hot standby patches
Next
From: Frank Featherlight
Date:
Subject: Re: Service not starting: Error 1053