Hi,
I am reviewing this one year old change again before backporting it to 9.1.3 for
production use.
ATM, I believe the code is correct, but I don't want to miss the change to spot
possible errors, so please let me dump my brain on some points:
- IIUC, SIGetDataEntries() can return 0 when in fact there _are_ messages because stateP->hasMessages could come from
astale cache (iow there is no read-membar used and because we return before acquiring SInvalReadLock (which the patch
isall about in the first place), we don't get an implicit read-membar from a lock op any more).
What I can't judge on: Would this cause any harm? What are the consequences of SIGetDataEntries returning 0 after
anotherprocess has posted a message (looking at global temporal ordering)?
I don't quite understand the significance of the respective comment in the code that the incoherence should be
acceptablebecause the cached read can't migrate to before a previous lock acquisition (which itself is clear).
AcceptInvalidationMessages has a comment that it should be the first thing to do in a transaction, and I am not sure
ifall the consumers have a read-membar equivalent operation in place.
How bad would a missed cache invalidation be? Should we have a read-membar in SIGetDataEntries just to be safe?
Other notes on points which appear correct to me (really more a note to myself):
- stateP->hasMessages = false in SIGetDataEntries is membar'ed by SpinLockAcquire(&vsegP->msgnumLock), so it shouldn't
happenthat clearing hasMessages moves behind reading msgnumLock
(in which case we could loose the hasMessages flag)
- but it can happen that hasMessages gets set when in fact there is nothing to read (which is fine because we then
checkmaxMsgNum)
Nils