Thread: Removal of AcceptInvalidationMessages broke things

Removal of AcceptInvalidationMessages broke things

From

Tom Lane

Date:

19 September 2012, 17:17:22

I looked into bug #7557, which demonstrates a case where a session fails
to notice a just-committed change in table permissions.  This is pretty
obviously due to a failure to read the sinval message notifying other
backends of the pg_class.relacl update.  Some digging in the git history
says it got broken here:

commit 4240e429d0c2d889d0cda23c618f94e12c13ade7
Author: Robert Haas <rhaas@postgresql.org>
Date:   Fri Jul 8 22:19:30 2011 -0400
   Try to acquire relation locks in RangeVarGetRelid.      In the previous coding, we would look up a relation in
RangeVarGetRelid,  lock the resulting OID, and then AcceptInvalidationMessages().  While   this was sufficient to
ensurethat we noticed any changes to the   relation definition before building the relcache entry, it didn't   handle
thepossibility that the name we looked up no longer referenced   the same OID.  This was particularly problematic in
thecase where a   table had been dropped and recreated: we'd latch on to the entry for   the old relation and fail
lateron.  Now, we acquire the relation lock   inside RangeVarGetRelid, and retry the name lookup if we notice that
invalidationmessages have been processed meanwhile.  Many operations   that would previously have failed with an error
inthe presence of   concurrent DDL will now succeed.
 

because that commit removed this bit from relation_openrv:

-   /*
-    * Check for shared-cache-inval messages before trying to open the
-    * relation.  This is needed to cover the case where the name identifies a
-    * rel that has been dropped and recreated since the start of our
-    * transaction: if we don't flush the old syscache entry then we'll latch
-    * onto that entry and suffer an error when we do RelationIdGetRelation.
-    * Note that relation_open does not need to do this, since a relation's
-    * OID never changes.
-    *
-    * We skip this if asked for NoLock, on the assumption that the caller has
-    * already ensured some appropriate lock is held.
-    */
-   if (lockmode != NoLock)
-       AcceptInvalidationMessages();

and there are no other places where a transaction will do
AcceptInvalidationMessages if it thinks it already holds lock on the
tables it's working with.

Since we have only a few hours before 9.2.1 is due to wrap, my
inclination for a band-aid fix is to put back that code.  There might be
some more elegant answer, but we haven't got time to find it now.

Comments?
        regards, tom lane

Re: Removal of AcceptInvalidationMessages broke things

From

Noah Misch

Date:

19 September 2012, 18:17:49

On Wed, Sep 19, 2012 at 01:17:17PM -0400, Tom Lane wrote:
> I looked into bug #7557, which demonstrates a case where a session fails
> to notice a just-committed change in table permissions.

> -   /*
> -    * Check for shared-cache-inval messages before trying to open the
> -    * relation.  This is needed to cover the case where the name identifies a
> -    * rel that has been dropped and recreated since the start of our
> -    * transaction: if we don't flush the old syscache entry then we'll latch
> -    * onto that entry and suffer an error when we do RelationIdGetRelation.
> -    * Note that relation_open does not need to do this, since a relation's
> -    * OID never changes.
> -    *
> -    * We skip this if asked for NoLock, on the assumption that the caller has
> -    * already ensured some appropriate lock is held.
> -    */
> -   if (lockmode != NoLock)
> -       AcceptInvalidationMessages();
> 
> and there are no other places where a transaction will do
> AcceptInvalidationMessages if it thinks it already holds lock on the
> tables it's working with.
> 
> Since we have only a few hours before 9.2.1 is due to wrap, my
> inclination for a band-aid fix is to put back that code.  There might be
> some more elegant answer, but we haven't got time to find it now.

Sounds fine for now.  I suspect the better change would be to make
AcceptInvalidationMessages() unconditional in LockRelationOid() and friends.
There's no reason to desire recent ACLs only when opening by name.

Re: Removal of AcceptInvalidationMessages broke things

From

Tom Lane

Date:

19 September 2012, 18:25:10

Noah Misch <noah@leadboat.com> writes:
> On Wed, Sep 19, 2012 at 01:17:17PM -0400, Tom Lane wrote:
>> Since we have only a few hours before 9.2.1 is due to wrap, my
>> inclination for a band-aid fix is to put back that code.  There might be
>> some more elegant answer, but we haven't got time to find it now.

> Sounds fine for now.  I suspect the better change would be to make
> AcceptInvalidationMessages() unconditional in LockRelationOid() and friends.
> There's no reason to desire recent ACLs only when opening by name.

I think it's enough for now because the first access to a relation in a
statement is always a name-based lookup from the parser.  Were that not
sufficient, we'd have had complaints before.

The core problem really is that GRANT/REVOKE don't take any object-level
lock on what they're changing.  A "real" fix might require sprinkling
AcceptInvalidationMessages calls into aclchk.c, but I'm unsure of the
performance costs of that.  Anyway, today is not the time to design
something better.
        regards, tom lane

Re: Removal of AcceptInvalidationMessages broke things

From

Robert Haas

Date:

19 September 2012, 18:53:06

On Wed, Sep 19, 2012 at 2:17 PM, Noah Misch <noah@leadboat.com> wrote:
> Sounds fine for now.  I suspect the better change would be to make
> AcceptInvalidationMessages() unconditional in LockRelationOid() and friends.
> There's no reason to desire recent ACLs only when opening by name.

I agree, on both counts.  I think I failed to realize when doing that
refactoring that I was reducing the number of cases where
AcceptInvalidationMessages() would actually be called.  AFAICS, the
only reason why we have such complicated rules for calling that
function is that it's traditionally been expensive.  But that should
be much less true due now due to improvements in 9.2 (cf commit
b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4).  So we can probably afford
to be a little less byzantine about the way we do this now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company