On Mon, 2010-04-19 at 10:36 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > Log Message:
> > -----------
> > Tune GetSnapshotData() during Hot Standby by avoiding loop
> > through normal backends. Makes code clearer also, since we
> > avoid various Assert()s. Performance of snapshots taken
> > during recovery no longer depends upon number of read-only
> > backends.
>
> I think there's a little race condition there.
> snapshot->takenDuringRecovery is set before acquiring ProcArrayLock, so
> it's possible that by the time we acquire the lock, we're no longer in
> recovery. So this can happen:
>
> 1. Backend A starts to take a snapshot, while we're still in recovery.
> takenDuringRecovery is assigned true.
> 2. Recovery ends, and a normal transaction X begins in backend B.
> 3. A skips scanning ProcArray because takenDuringRecovery is true.
>
> The snapshot doesn't include X, so any changes done in that transaction
> will not be visible to the snapshot while the transaction is still
> running, but will be after it commits.
>
> Of course, it's highly improbable for 2. to happen, but it's there.
The order of events is as you say, though I don't see the problem. The
new xids will be beyond xmax and would be filtered out even if we did
scan the procs, so they will be treated as running, which they are. Xmax
will not have advanced since that relies on completed transactions, not
started ones.
-- Simon Riggs www.2ndQuadrant.com