Re: Make relation_openrv atomic wrt DDL - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Make relation_openrv atomic wrt DDL
Date
Msg-id 20110613051233.GI21098@tornado.leadboat.com
In response to Re: Make relation_openrv atomic wrt DDL  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Make relation_openrv atomic wrt DDL
List pgsql-hackers
On Sun, Jun 12, 2011 at 10:56:41PM -0400, Robert Haas wrote:
> On Sun, Jun 12, 2011 at 9:23 PM, Noah Misch <noah@leadboat.com> wrote:
> >> I haven't reviewed
> >> your patch in detail, but is there a way we can encapsulate the
> >> knowledge of the invalidation system down inside the sinval machinery,
> >> rather than letting the heap code have to know directly about the
> >> counter?  Perhaps AIV() could return true or false depending on
> >> whether any invalidation messages were processed, or somesuch.
> >
> > I actually did it exactly that way originally.  The problem was the return value
> > only applying to the given call, while I wished to answer a question like "Did
> > any call to AcceptInvalidationMessages() between code point A and code point B
> > process a message?"  Adding AcceptInvalidationMessages() calls to code between A
> > and B would make the return value test yield a false negative.  A global counter
> > was the best thing I could come up with that avoided this hazard.
> 
> Oh, interesting point.  What if AcceptInvalidationMessages returns the
> counter?  Maybe with typedef uint32 InvalidationPositionId or
> something like that, to make it partially self-documenting, and
> greppable.

That might be a start, but it's not a complete replacement for the global
counter.  AcceptInvalidationMessages() is actually called in LockRelationOid(),
but the comparison needs to happen a level up in RangeVarLockRelid().  So, we
would be adding encapsulation in one place to lose it in another.  Also, in the
uncontended case, the patch only calls AcceptInvalidationMessages() once per
relation_openrv.  It compares the counter after that call with a counter as the
last caller left it -- RangeVarLockRelid() doesn't care who that caller was.

> Taking that a bit further, what if we put that counter in
> shared-memory?  After writing new messages into the queue, a writer
> would bump this count (only one process can be doing this at a time
> because SInvalWriteLock is held) and memory-fence.  Readers would
> memory-fence and then read the count before acquiring the lock.  If it
> hasn't changed since we last read it, then don't bother acquiring
> SInvalReadLock, because no new messages have arrived.  Or maybe an
> exact multiple of 2^32 messages have arrived, but there's probably
> someway to finesse around that issue, like maybe also using some kind
> of memory barrier to allow resetState to be checked without the lock.

This probably would not replace a backend-local counter of processed messages
for RangeVarLockRelid()'s purposes.  It's quite possibly a good way to reduce
SInvalReadLock traffic, though.

Exact multiples of 2^32 messages need not be a problem, because the queue is
limited to MAXNUMMESSAGES (4096, currently).  I think you will need to pack into
one 32-bit value all data each backend needs to decide whether to proceed with
the full process.  Given that queue offsets fit into 13 bits (easily reduced to
12) and resetState is a bit, that seems practical enough at first glance.

nm

