Robert Haas wrote:
> I had some review comments
> I was hoping to get responses to, in the section beginning with "A few
> other comments based on a preliminary reading of this patch":
>
> http://archives.postgresql.org/pgsql-hackers/2009-07/msg00854.php
Having read the patch now, here's one issue in addition to the remarks
you made in the mail linked above, and all the things already marked with
XXX comments:
I think there's a race condition in the way LogCurrentRunningXacts() is
called at the end of checkpoint. This can happen in the master:
1. Checkpoint starts
2. Transaction 123 begins, and does some updates
3. Checkpoint ends. LogCurrentRunningXacts() is called.
4. LogCurrentRunningXacts() gets the list of currently running
transactions by calling GetCurrentTransactionData().
5. Transaction 123 ends, writing commit record to WAL
6. LogCurrentRunningXacts() writes the list of running XIDs to WAL. This
includes XID 123, since that was still running at step 4.
When that record is replayed, ProcArrayUpdateTransactions() will zap the
unobserved xids array with the list that includes XID 123, even though
we already saw a commit record for it.
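To make the ordering concrete, here is a toy model of the sequence above. It is only a sketch: the record names ("update", "commit", "running_xacts"), the functions master_wal() and replay(), and the use of a plain set for the unobserved xids are all made up for illustration and are not the patch's actual code or WAL format. It assumes replay simply overwrites the set with whatever list the running-xacts record carries.

```python
# Toy model of the race; all names here are hypothetical stand-ins,
# not PostgreSQL's real functions or WAL record types.

def master_wal():
    """Produce the WAL stream for steps 1-6 as a list of (kind, payload)."""
    running = set()
    wal = []
    # Steps 1-2: checkpoint starts, transaction 123 begins and does updates.
    running.add(123)
    wal.append(("update", 123))
    # Steps 3-4: checkpoint ends; the list of running xids is collected.
    snapshot = sorted(running)
    # Step 5: transaction 123 commits before the snapshot reaches WAL.
    running.discard(123)
    wal.append(("commit", 123))
    # Step 6: the (now stale) snapshot is written to WAL.
    wal.append(("running_xacts", snapshot))
    return wal

def replay(wal):
    """Replay the stream, tracking which xids are believed to be running."""
    unobserved_xids = set()
    for kind, payload in wal:
        if kind == "update":
            unobserved_xids.add(payload)
        elif kind == "commit":
            unobserved_xids.discard(payload)
        elif kind == "running_xacts":
            # Assumed behaviour: overwrite with the logged list.
            unobserved_xids = set(payload)
    return unobserved_xids

wal = master_wal()
stale = replay(wal)
print(wal)
print(stale)
```

In this model the set ends up containing 123 after replay, because the running-xacts record comes after the commit record in the stream but reflects state from before it.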
I removed some Recovery Proc related crud that was still in the patch
but unused. Merge from the "hs" branch at
git://git.postgresql.org/git/users/heikki/postgres.git to get that change.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com