Tom Lane <tgl@sss.pgh.pa.us> writes:
> Barring objections, I'm off to program this.
A few concerns
a) The use of ShareUpdateExclusiveLock is supposed to lock out concurrent vacuums. I just tried it and vacuum seemed
to be unaffected. I'm going to retry it with a clean cvs checkout to be sure it isn't something in my local tree
that's broken.
Do we still need to block concurrent vacuums if we're using snapshots? Obviously we have to block them during phase
1 because vacuum won't have a chance of removing the tuples from our private collection of index tuples that haven't
been pushed live yet. But if phase 2 is ignoring tuples too new to be visible in its snapshot then it shouldn't care if
dead tuples are deleted, even if those slots are later reused.
b) You introduced a LockRelationIdForSession() call (I didn't even realize we had this capability when I wrote the
patch). Does this introduce the possibility of a deadlock though? If one of the transactions we're waiting to finish
has a shared lock on the relation and is waiting for an exclusive lock on the relation then it seems we'll wait forever
for it to finish and never see either of our conditions for continuing. That would be fine except that, because we're
waiting manually, the deadlock detection code doesn't have a chance of firing.
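To make the worry concrete, here is a toy wait-for graph in Python. It is only a sketch of the argument, not the lock manager's actual algorithm: the node names "T" and "index_build" and the helper has_cycle are made up for illustration. The assumption is that the detector only looks at waits recorded in the lock table, so a wait implemented as a sleep loop is invisible to it.

```python
# Toy wait-for graph: an edge (A, B) means "A cannot proceed until B does".
# All names here are hypothetical; this is not PostgreSQL code.

def has_cycle(edges):
    """Return True if the directed graph given as (waiter, holder) pairs has a cycle."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)
        graph.setdefault(holder, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge: we walked into our own path
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


# Wait recorded in the lock table: transaction "T" is queued for an
# exclusive lock that conflicts with the session lock held by the build.
lock_manager_edges = [("T", "index_build")]

# Wait known only to the build: it sleeps and re-checks until "T" has
# finished, so nothing in the lock table records this edge.
manual_wait_edges = [("index_build", "T")]

detector_sees_deadlock = has_cycle(lock_manager_edges)
actual_deadlock = has_cycle(lock_manager_edges + manual_wait_edges)
```

With only the lock-table edge the graph has no cycle, so nothing would be reported; adding the manual wait closes the loop.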
To solve that we would have to replace the pg_sleep call with an XactLockTableWait. But I'm not clear how to find a
transaction id to wait on. What we would want to find is any transaction id that has an xmin older than our xmin. Even
that isn't ideal since it wouldn't give us a chance to test our other condition for continuing, so if we choose a
transaction to wait on that doesn't hold even a share lock on our table we could end up stuck longer than necessary
(i.e. when we would have been able to momentarily acquire the exclusive lock on the table earlier).
c) It's a shame we don't support multiple simultaneous concurrent index builds. We could create a ShareUpdateShareLock
that conflicts with the same list of locks that ShareUpdateExclusiveLock conflicts with but not itself. From a UI
point of view there's no excuse for not doing this, but from an implementation point of view there's a limit of 10 lock
types and this would get up to 9. This is my first time looking at this code so I'm not sure how hard that limit is.
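As a rough illustration of what such a mode's conflict row might look like, here is a small Python model. ShareUpdateShareLock does not exist; it is the hypothetical mode proposed above. The sketch assumes the proposal means the new mode still conflicts with ShareUpdateExclusiveLock (so vacuum stays locked out) and only drops the self-conflict. Only these two rows of the table are modelled, and the function name conflicts is made up.

```python
SUEL = "ShareUpdateExclusiveLock"
SUSL = "ShareUpdateShareLock"  # hypothetical mode, not in PostgreSQL

# Modes that ShareUpdateExclusiveLock conflicts with today (including itself).
SUEL_CONFLICTS = frozenset({
    SUEL,
    "ShareLock",
    "ShareRowExclusiveLock",
    "ExclusiveLock",
    "AccessExclusiveLock",
})

# Conflict rows for the two modes of interest. The hypothetical mode takes
# the same list but is not added to its own row, so two holders can coexist.
# Conflicts are symmetric, so the existing mode's row gains the new mode.
CONFLICTS = {
    SUEL: SUEL_CONFLICTS | {SUSL},
    SUSL: SUEL_CONFLICTS,
}


def conflicts(requested, held):
    """True if a request for `requested` must wait behind a holder of `held`."""
    return held in CONFLICTS[requested]


two_builds_can_coexist = not conflicts(SUSL, SUSL)
vacuum_still_blocked = conflicts(SUEL, SUSL) and conflicts(SUSL, SUEL)
```

Under those assumptions two index builds holding the new mode would not block each other, while anything taking ShareUpdateExclusiveLock would still have to wait.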
One caveat is that the two jobs would see each other and that would make it hard for them to proceed to phase 2. I
think what would happen is that the first one to finish phase 1 would be able to continue as soon as the other
finishes phase 1. The second one would have to wait until the first one's phase 2 finished.
-- Gregory Stark EnterpriseDB http://www.enterprisedb.com