Thread: Issue NOTICE for attempt to raise lock level?

Issue NOTICE for attempt to raise lock level?

From
Tom Lane
Date:
I am working on eliminating the "relation NNN modified while in use"
misfeature by instead grabbing a lock on each relation at first use
in a statement, and holding that lock till end of transaction.  The
main trick here is to make sure that the first lock grabbed is adequate
--- for example, it won't do to grab AccessShareLock and then have to
raise that to AccessExclusiveLock, because there will be a deadlock if
two backends do this concurrently.

To help debug this, I'm planning to add a little bit of code to the
lock manager that detects a request for a lock on an object on which
we already hold a lock of a lower level.  What I'm wondering about is
whether to make the report be elog(DEBUG) --- ie, send to postmaster
log only --- or elog(NOTICE), so that users would see it by default.

A NOTICE might be useful to users since it would complain about
deadlock-prone user-level coding practices too, such as
begin;select * from foo;    -- grabs read locklock table foo;        -- grabs exclusive lock

However, it might not be *very* useful, because the lock manager isn't
in a position to issue a message that's much more intelligible than
this:

NOTICE: Deadlock risk: raising lock level from 1 to 4 on object 85372/5732

(The lock level could be printed symbolically, but I doubt that very
much can be done with the object identifier --- it's not safe for the
lock manager to try to resolve relation OIDs to names, for example.)

Right now I'm thinking that this sort of notice would just create more
confusion than enlightenment for most users, so I'm inclined to make it
a DEBUG message.  But that's a judgment call, so I thought I'd throw
the issue out for discussion.  Any contrary opinions?
        regards, tom lane


Re: Issue NOTICE for attempt to raise lock level?

From
Peter Eisentraut
Date:
Tom Lane writes:

> To help debug this, I'm planning to add a little bit of code to the
> lock manager that detects a request for a lock on an object on which
> we already hold a lock of a lower level.  What I'm wondering about is
> whether to make the report be elog(DEBUG) --- ie, send to postmaster
> log only --- or elog(NOTICE), so that users would see it by default.

To me this seems to be a little like the much-disputed notice for adding
implicit range-table entries:  Either it's an error, then you abort, or
it's legal, then you leave the user alone and perhaps explain failure
scenarios in the documentation.  At least until we have something like a
user-configurable warning level.

elog(DEBUG) might be okay, but only with a positive DebugLvl, IMHO.

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/



RE: Issue NOTICE for attempt to raise lock level?

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane
> Sent: Wednesday, November 08, 2000 1:26 AM
> To: pgsql-hackers@postgreSQL.org
> Subject: [HACKERS] Issue NOTICE for attempt to raise lock level?
> 
> 
> I am working on eliminating the "relation NNN modified while in use"
> misfeature by instead grabbing a lock on each relation at first use
> in a statement, and holding that lock till end of transaction. 

Isn't "relation NNN modified while in use" itself coming from heap_
open(r) 's LockRelation_after_allocate sequence ?
Or from a rd_refcnt leak,of cource.
I'm thinking that RelationCacheInvalidate() should ignore relations
which are while in use. IMHO allocate_after_lock sequence is
needed for heap_open(r). 

> The
> main trick here is to make sure that the first lock grabbed is adequate
> --- for example, it won't do to grab AccessShareLock and then have to
> raise that to AccessExclusiveLock, because there will be a deadlock if
> two backends do this concurrently.
> 

I object to you if it also includes parse_rewrite_plan stage.
If there's a long transation it would also hold a AccessShareLock
on system tables for a long time. Then vacuum for system tables
would be blocked. Other transactions would be blocked......

Regards.
Hiroshi Inoue 


Re: Issue NOTICE for attempt to raise lock level?

From
Tom Lane
Date:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>> I am working on eliminating the "relation NNN modified while in use"
>> misfeature by instead grabbing a lock on each relation at first use
>> in a statement, and holding that lock till end of transaction. 

> Isn't "relation NNN modified while in use" itself coming from heap_
> open(r) 's LockRelation_after_allocate sequence ?

Right.  I intend to eliminate that test entirely, and simply let the
relcache update happen.  With appropriate start-to-end locking, it
shouldn't be possible for a schema update to sneak in at an unsafe
point.

The only reason that there is a "modified while in use" test at all
is that I put it in awhile back as a stopgap solution until we did
a better job with the end-to-end locking problem.  The reports coming
back on 7.0.* make it clear that the stopgap answer isn't good enough,
so I want to fix it right for 7.1.

> I'm thinking that RelationCacheInvalidate() should ignore relations
> which are while in use.

Won't work unless you somehow store an "update needed" flag to make the
update happen later --- you can't just discard a shared-inval
notification.  And if you did that, you'd have synchronization issues
to contend with.  Since we use SnapshotNow for reading catalogs, catalog
fetches may see data that is inconsistent with the current state of the
relcache.  Not good.  Forcing the schema update to be held off in the
first place seems the right answer.

> I object to you if it also includes parse_rewrite_plan stage.
> If there's a long transation it would also hold a AccessShareLock
> on system tables for a long time.

No, I'm going to leave locking of system catalogs as-is.  This basically
means that we don't support concurrent alteration of schemas for system
tables.  Seems like an OK tradeoff to me ...
        regards, tom lane


Re: Issue NOTICE for attempt to raise lock level?

From
Hiroshi Inoue
Date:

Tom Lane wrote:

> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> >> I am working on eliminating the "relation NNN modified while in use"
> >> misfeature by instead grabbing a lock on each relation at first use
> >> in a statement, and holding that lock till end of transaction.
>
> > Isn't "relation NNN modified while in use" itself coming from heap_
> > open(r) 's LockRelation_after_allocate sequence ?
>
> Right.  I intend to eliminate that test entirely, and simply let the
> relcache update happen.  With appropriate start-to-end locking, it
> shouldn't be possible for a schema update to sneak in at an unsafe
> point.
>
> The only reason that there is a "modified while in use" test at all
> is that I put it in awhile back as a stopgap solution until we did
> a better job with the end-to-end locking problem.  The reports coming
> back on 7.0.* make it clear that the stopgap answer isn't good enough,
> so I want to fix it right for 7.1.
>
> > I'm thinking that RelationCacheInvalidate() should ignore relations
> > which are while in use.
>
> Won't work unless you somehow store an "update needed" flag to make the
> update happen later --- you can't just discard a shared-inval
> notification.

What I mean is to change heap_open(r) like
 LockRelationId(Name) -> shared-inval-handling -> allocate the relation descriptor and increment rd_refcnt

This would ensure that relations with rd_refcnt > 0
acquire some lock. Could any shared-inval-noti
fication arrive for such relations under the me-
chanism ? However 'reset system cache' message
could arrive at any time. I've examined the error
'recursive use of cache' for some time. It seems
very difficult to avoid the error if we reconstruct
relation descriptors whose rd_refcnt > 0 in
RelationCacheInvalidate().

Comments ?

Regards.
Hiroshi Inoue



Re: Issue NOTICE for attempt to raise lock level?

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> What I mean is to change heap_open(r) like
>   LockRelationId(Name) -> shared-inval-handling ->
>   allocate the relation descriptor and increment rd_refcnt
> This would ensure that relations with rd_refcnt > 0
> acquire some lock. Could any shared-inval-noti
> fication arrive for such relations under the me-
> chanism ?

Yes, because the system doesn't make any attempt to ensure that relcache
entries are held open throughout a statement or transaction.  (If they
were, we largely wouldn't have a problem.)  So we can't use relcache
refcount going from 0 to 1 as the sole criterion for when to acquire
a lock.

I did look at using the relcache to control holding locks throughout
statements, but it seems that it doesn't have enough information
to grab the right kind of lock.  For example, I had to modify the
parser to ensure that the right kind of lock is grabbed on the
initial relcache access, depending on whether the table involved is
accessed for plain SELECT, SELECT FOR UPDATE, or INSERT/UPDATE/DELETE.
I still have to make a similar change in the rewriter for table
references that are added to a query by rewrite.  The code that is
doing this stuff knows full well that it is making the first reference
to a table, and so the relcache doesn't really have anything to
contribute.

> However 'reset system cache' message
> could arrive at any time. I've examined the error
> 'recursive use of cache' for some time. It seems
> very difficult to avoid the error if we reconstruct
> relation descriptors whose rd_refcnt > 0 in
> RelationCacheInvalidate().

I haven't had time to look at that yet, but one possible answer is just
to disable the 'recursive use of cache' test.  It's only a debugging
sanity-check anyway, not essential functionality.
        regards, tom lane