Strangeness in xid allocation / snapshot setup - Mailing list pgsql-hackers

From Tom Lane
Subject Strangeness in xid allocation / snapshot setup
Date
Msg-id 3428.994883580@sss.pgh.pa.us
Whole thread Raw
List pgsql-hackers
Hi Vadim,

I am trying to understand why GetSnapshotData() needs to acquire the
SInval spinlock before it calls ReadNewTransactionId, rather than after.
I see that you made it do so --- in the commit at
http://www.ca.postgresql.org/cgi/cvsweb.cgi/pgsql/src/backend/storage/ipc/shmem.c.diff?r1=1.41&r2=1.42
but I don't understand why the loss of concurrency is "necessary".
Since we are going to treat all xids >= xmax as in-progress anyway,
what's wrong with reading xmax before we acquire the SInval lock?

Also, it seems to me that in GetNewTransactionId(), it's important
for MyProc->xid to be set before releasing XidGenLockId, not after.
Otherwise there is a narrow window for failure:

1. Process A calls GetNewTransactionId.  It allocates an xid of, say,
1234, and increments nextXid to 1235.  Just after releasing the
XidGenLock spinlock, but before it can set its MyProc->xid, control
swaps away from it.

2. Process B gets to run.  It runs GetSnapshotData.  It sees nextXid =
1235, and it does not see xid = 1234 in any backend's proc->xid.
Therefore, B will assume xid 1234 has already terminated, when it
hasn't.

Isn't this broken?  The problem would be avoided if GetNewTransactionId
sets MyProc->xid before releasing the spinlock, since then after
GetSnapshotData has called ReadNewTransactionId, we know that all older
XIDs that are still active are recorded in proc structures.

Comments?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: I can't keep up
Next
From: Bruce Momjian
Date:
Subject: Re: I can't keep up