Re: sinval synchronization considered harmful - Mailing list pgsql-hackers

From:           Noah Misch
Subject:        Re: sinval synchronization considered harmful
Date:
Msg-id:         20110727165531.GD18910@tornado.leadboat.com
In response to: Re: sinval synchronization considered harmful (Robert Haas <robertmhaas@gmail.com>)
Responses:      Re: sinval synchronization considered harmful
List:           pgsql-hackers
On Tue, Jul 26, 2011 at 09:57:10PM -0400, Robert Haas wrote:
> On Tue, Jul 26, 2011 at 4:38 PM, Noah Misch <noah@2ndquadrant.com> wrote:
> > No new ideas come to mind, here.
>
> OK, I have a new idea.  :-)
>
> 1. Add a new flag to each procState called something like "timeToPayAttention".
> 2. Each call to SIInsertDataEntries() iterates over all the ProcStates
> whose index is < lastBackend and sets stateP->timeToPayAttention =
> TRUE for each.
> 3. At the beginning of SIGetDataEntries(), we do an unlocked if
> (!stateP->timeToPayAttention) return 0.
> 4. Immediately following that if statement and before acquiring any
> locks, we set stateP->timeToPayAttention = FALSE.
>
> The LWLockRelease() in SIInsertDataEntries() will be sufficient to force
> all of the stateP->timeToPayAttention writes out to main memory, and
> the read side is OK because we either just took a lock (which acted as
> a fence) or else there's a race anyway.  But unlike my previous
> proposal, it doesn't involve *comparing* anything.  We needn't worry
> about whether we could read two different values that are, through
> great misfortune, out of sync, because we're only reading one value.
>
> If, by chance, the value is set to true just after we set it to false,
> that won't mess us up either: we'll still read all the messages after
> acquiring the lock.

This approach would work if a spinlock release constrained the global timeline of stores. It actually makes a weaker guarantee: all stores preceding the lock release in program order will precede it in the global timeline. Consequently, no processor will observe the spinlock to be available without also observing all stores the last holder made before releasing it.
That guarantee is not enough for this sequence of events:

Backend 0:
    add a message for table "a"
    store 5 => maxMsgNum
    store true => timeToPayAttention
    LWLockRelease(SInvalWriteLock)

<plenty of time passes>

Backend 2:
    LOCK TABLE a;
    load timeToPayAttention => true

Backend 1:
    add a message for table "b"
    store 6 => maxMsgNum
    SpinLockRelease(&vsegP->msgnumLock);
    store true => timeToPayAttention

Backend 2:
    store false => timeToPayAttention
    LWLockAcquire(SInvalReadLock, LW_SHARED)
    load maxMsgNum => 5 [!]

Backend 1:
    LWLockRelease(SInvalWriteLock);

Backend 2:
    LOCK TABLE b;
    load timeToPayAttention => false
    <use b without processing updates>

The SpinLockRelease(&vsegP->msgnumLock) guarantees that any process observing the backend 1 store of timeToPayAttention will also observe maxMsgNum = 6. However, nothing constrains which timeToPayAttention store will "win" in main memory. Backend 2 can observe neither update from backend 1, yet still have its store appear later than the backend 1 stores in the global timeline.

> Now, there's some downside to this approach - it involves doing O(N)
> work for each invalidation message we receive.  But as long as it's
> only O(N) stores and not O(N) lock acquisitions and releases, I think
> that might be OK.

I think a benchmark is in order: something like 900 idle connections and 80 connections running small transactions that create a few temporary tables. If that shows no statistically significant regression, then we're probably fine here. I'm not sure what result to expect, honestly.

What did you think of making the message number a uint64, removing wraparound handling, and retaining SISeg.msgnumLock for 32-bit only? You could isolate the variant logic in READ_MSGNUM and WRITE_MSGNUM macros.

-- 
Noah Misch                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
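For what it's worth, the READ_MSGNUM/WRITE_MSGNUM split suggested in the message above might look roughly like this. The HAVE_ATOMIC_64BIT_ACCESS symbol, the stand-in slock_t and spinlock macros, and the pared-down SISeg are all hypothetical stand-ins for this sketch, not PostgreSQL's actual definitions:

```c
/* Hypothetical sketch: uint64 message number with no wraparound
 * handling; on 64-bit platforms, plain aligned loads/stores suffice,
 * while 32-bit platforms keep msgnumLock around each access.
 * Names follow the mail; types are simplified stand-ins. */
#include <assert.h>
#include <stdint.h>

typedef struct { volatile int lock; } slock_t;   /* stand-in spinlock */
#define SpinLockAcquire(l)  ((l)->lock = 1)      /* placeholder, not real */
#define SpinLockRelease(l)  ((l)->lock = 0)      /* placeholder, not real */

typedef struct SISeg {
    uint64_t maxMsgNum;
#ifndef HAVE_ATOMIC_64BIT_ACCESS                 /* i.e. 32-bit platforms */
    slock_t  msgnumLock;
#endif
} SISeg;

#ifdef HAVE_ATOMIC_64BIT_ACCESS
/* 64-bit: an aligned uint64 access is a single atomic load/store. */
#define READ_MSGNUM(seg, dst)   ((dst) = (seg)->maxMsgNum)
#define WRITE_MSGNUM(seg, val)  ((seg)->maxMsgNum = (val))
#else
/* 32-bit: a uint64 access could tear, so bracket it with the spinlock. */
#define READ_MSGNUM(seg, dst) \
    do { \
        SpinLockAcquire(&(seg)->msgnumLock); \
        (dst) = (seg)->maxMsgNum; \
        SpinLockRelease(&(seg)->msgnumLock); \
    } while (0)
#define WRITE_MSGNUM(seg, val) \
    do { \
        SpinLockAcquire(&(seg)->msgnumLock); \
        (seg)->maxMsgNum = (val); \
        SpinLockRelease(&(seg)->msgnumLock); \
    } while (0)
#endif
```

Callers would then use READ_MSGNUM/WRITE_MSGNUM everywhere and never touch maxMsgNum or msgnumLock directly, keeping the platform variance in one place.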