Re: hot standby - merged up to CVS HEAD - Mailing list pgsql-hackers

From David Fetter
Subject Re: hot standby - merged up to CVS HEAD
Msg-id 20090827191550.GA3886@fetter.org
In response to Re: hot standby - merged up to CVS HEAD  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Thu, Aug 27, 2009 at 07:08:28PM +0100, Simon Riggs wrote:
> 
> On Mon, 2009-08-17 at 11:19 +0300, Heikki Linnakangas wrote:
> 
> > I think there's a race condition in the way LogCurrentRunningXacts() is
> > called at the end of checkpoint. This can happen in the master:
> > 
> > 1. Checkpoint starts
> > 2. Transaction 123 begins, and does some updates
> > 3. Checkpoint ends. LogCurrentRunningXacts() is called.
> > 4. LogCurrentRunningXacts() gets the list of currently running
> > transactions by calling GetCurrentTransactionData().
> > 5. Transaction 123 ends, writing commit record to WAL
> > 6. LogCurrentRunningXacts() writes the list of running XIDs to WAL. This
> > includes XID 123, since that was still running at step 4.
> > 
> > When that is replayed, ProcArrayUpdateTransactions() will zap the
> > unobserved xids array with the list that includes XID 123, even though
> > we already saw a commit record for it.
> 
> That's not a race condition, but it does make the code more complex. The
> issue has been long understood.
> 
> I don't think it's acceptable to take and hold both ProcArray and
> WALInsertLock. Those are now the two most heavily contended locks on the
> system. We have evidence that there are burst delays associated with
> various operations on just one of those locks, let alone two.
> 
> If you're still doubtful, the problem I've been working on recently is
> the point that I overlooked the initial state of the lock table in my
> earlier patch. GetRunningTransactionData() also needs to have initial
> lock data.
> 
> There is no way in hell that I could personally condone holding
> ProcArrayLock, WALInsertLock and all of the LockMgrLock partitions at
> the same time. So we just have to eat the complexity. (No doubt someone will
> disagree with my strong language here, but please take it as an
> indication of exactly how bad an idea holding multiple locks will be).
> 
> Slight timing issues are not too bad really. We just have to be careful
> to assume that there is a mismatch in the data and must have code to
> handle that.
> 
> Anyway, I've been working on this problem for some time and continue to
> do so.
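
For anyone following along, here is a toy model of the replay sequence Heikki describes and of the "assume there is a mismatch and handle it" approach Simon argues for. It is only a sketch under stated assumptions, not PostgreSQL code: the record kinds, the replay() function and the tolerate_stale flag are all hypothetical, and a real standby could not keep an unbounded set of finished XIDs the way this does.

```python
# Toy model of standby replay. NOT PostgreSQL code; every name here is
# hypothetical and exists only to illustrate the ordering problem.

def replay(records, tolerate_stale=False):
    """Replay WAL-like records in order and return the set of XIDs the
    standby still believes are running afterwards."""
    running = set()   # stand-in for the unobserved-xids array
    finished = set()  # XIDs whose commit record has already been replayed

    for kind, payload in records:
        if kind == "update":
            # First sight of an XID that has not finished: track it.
            if payload not in finished:
                running.add(payload)
        elif kind == "commit":
            running.discard(payload)
            finished.add(payload)
        elif kind == "running_xacts":
            snapshot = set(payload)
            if tolerate_stale:
                # Assume the snapshot may predate commits already replayed,
                # and drop any XID known to have finished.
                snapshot -= finished
            # Overwrite the tracked set with the logged snapshot.
            running = snapshot
    return running


# WAL order from the scenario: XID 123 commits (step 5) before the
# running-xacts record is written (step 6), yet the record still lists
# 123 because the list was collected earlier (step 4).
wal = [
    ("update", 123),
    ("commit", 123),
    ("running_xacts", [123]),
]

print(replay(wal))                       # {123}: committed XID resurrected
print(replay(wal, tolerate_stale=True))  # set(): mismatch handled on replay
```

The point of the second call is that the master takes no extra locks; the cost moves to the replay side, which has to treat the logged list as possibly stale.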

Great!  Where's the git repository?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

