Re: reviewing the "Reduce sinval synchronization overhead" patch / b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: reviewing the "Reduce sinval synchronization overhead" patch / b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4
Date
Msg-id CA+TgmoYmTWP6vk-mv7VM3TPcKN7HKWhkuMrD_dKMHhww-GgQRg@mail.gmail.com
Whole thread Raw
In response to reviewing the "Reduce sinval synchronization overhead" patch / b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4  (Nils Goroll <slink@schokola.de>)
Responses Re: reviewing the "Reduce sinval synchronization overhead" patch / b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4
List pgsql-hackers
On Tue, Aug 21, 2012 at 12:14 PM, Nils Goroll <slink@schokola.de> wrote:
> I am reviewing this one year old change again before backporting it to 9.1.3
> for production use.
>
> ATM, I believe the code is correct, but I don't want to miss the change to
> spot possible errors, so please let me dump my brain on some points:
>
> - IIUC, SIGetDataEntries() can return 0 when in fact there _are_ messages
>   because  stateP->hasMessages could come from a stale cache (iow there is
> no
>   read-membar used and because we return before acquiring SInvalReadLock
> (which
>   the patch is all about in the first place), we don't get an implicit
>   read-membar from a lock op any more).

Right.

>   What I can't judge on: Would this cause any harm? What are the
> consequences
>   of SIGetDataEntries returning 0 after another process has posted a message
>   (looking at global temporal ordering)?
>
>   I don't quite understand the significance of the respective comment in the
>   code that the incoherence should be acceptable because the cached read
> can't
>   migrate to before a previous lock acquisition (which itself is clear).

Our sinval synchronization mechanism has a somewhat weird design that
makes this OK.  Sinval basically exists to support backend-local
caching, and any given piece of data that's being cached is
conceptually protected by some heavyweight lock L, taken normally in
access-share mode.  That means that, before relying on a backend-local
cache to be up to date, you must take that heavyweight lock, which
will call AcceptInvalidationMessages().  The fact that you've
successfully taken that heavyweight lock means that nobody else is
changing the data you care about, because to do that they would have
needed a conflicting lock i.e. access-exclusive mode.  So the guy
modifying the data logically does this:

T0. take lock in access-exclusive mode
T1. change stuff
T2. send invalidation messages
T3. release lock

While the other guy does this:

U0. take lock in access-share mode
U1. receive invalidation messages
U2. rebuild cache if necessary
U3. release lock

Step U1 cannot occur before step U0 (because lock acquisition is a
memory barrier).   Step T2 cannot occur after step T3 (because lock
release is a memory barrier).  And step U0 cannot occur before step T3
(because the locks conflict).  So the ordering is necessarily
T2->T3->U0->U1; thus, T2 must happen before U1 and we're OK.

Now, it is still true that if the lock taken U0 is *a different lock*
than the one release in T3 then there's no ordering between T2 and U1,
so U1 could miss invalidation messages that wipe out *some cache other
than the one it's about to examine*.  But it can't miss the ones for
the cache that it actually cares about.  Woohoo!

>   AcceptInvalidationMessages has a comment that it should be the first thing
>   to do in a transaction, and I am not sure if all the consumers have a
>   read-membar equivalent operation in place.

The really important call site for this purpose is the one in LockRelationOid().

>   How bad would a missed cache invalidation be? Should we have a read-membar
>   in SIGetDataEntries just to be safe?

Not needed, per the above.  We should not add memory barriers anywhere
without a precise definition of what problem we're fixing.  They are
not free, and we don't want to get into the habit of inserting them as
ill-considered insurance against problems we don't fully understand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: David Gould
Date:
Subject: Re: huge tlb support
Next
From: Mathieu Fenniak
Date:
Subject: restartpoints stop generating on streaming-replication slave