Notice: heap_open/close changes committed - Mailing list pgsql-hackers

From Tom Lane
Subject Notice: heap_open/close changes committed
Date
Msg-id 20706.937683095@sss.pgh.pa.us
Whole thread Raw
List pgsql-hackers
I have just committed a ton of changes to make heap_open/heap_openr
take an additional argument specifying the kind of lock to grab on
the relation (or you can say "NoLock" to get the old, no-lock behavior).
Similarly, heap_close takes a new argument giving the kind of lock to
release, or "NoLock" to release no lock.  (If you don't release the
lock you got at open time, it's implicitly held till transaction end.)

This should go a long way towards fixing problems caused by a concurrent
VACUUM moving tuples in system relations.  I also fixed several bugs
(first spotted by Hiroshi) having to do with not getting a sufficient
lock on a relation being DROPped, ALTERed, etc.  There may be more of
those still lurking, though.

There are a couple of coding rules that ought to be pointed out:

1. If you specify a lock type to heap_open/openr, then heap_open will
elog(ERROR) if the relation cannot be found --- since it can't get the
lock in that case, obviously.  You must use NoLock (and then lock later,
if appropriate) if you want to be able to recover from no-such-relation.
This allowed extra code to test for no-such-rel to be removed from many
call sites.  There were a lot of other call sites that neglected to test
for failure return at all, so they're now a little more robust.

2. I made most opens of system relations grab AccessShareLock if
read-only, or RowExclusiveLock if read-write, on the theory that
these accesses correspond to an ordinary search or update of a user
relation.  This maximizes concurrency of access to the system tables.
It should be sufficient to have AccessExclusiveLock on the user relation
being modified in order to do most things like dropping/modifying
relations.  Note however that we release these locks as soon as we
close the system relations, whereas the lock on the underlying relation
needs to be held till transaction commit to ensure that other users of
the relation will see whatever you did.  There may be a few cases where
we need a stronger lock on system relations...

3. If you are doing a SearchSysCache (any flavor) to find a tuple that
you intend to modify/delete, you must open and lock the containing
relation *before* the search, not after.  Otherwise you may retrieve
a tuple containing a stale physical location pointer (because VACUUM
has just moved it), which will cause the heap_replace or heap_delete
to crash and burn.  There were a few places that did the heap_open
after getting the tuple.  Naughty naughty.


I did not change locking behavior for index_open/close --- for the
most part we just acquire AccessShareLock on an index during
index_beginscan.  I am not sure if this is good or not.  I did make
index_open do elog(ERROR) on failure, because most call sites weren't
checking for failure...

These changes do *not* include Hiroshi's recent proposal to add
time qual checking in SearchSysCache.  I suspect he is right but
would like confirmation from someone who actually understands the
time qual code ;-)
        regards, tom lane

PS: you should do "make clean all" after pulling these changes.


pgsql-hackers by date:

Previous
From: Adriaan Joubert
Date:
Subject: Patches for alpha w. cc
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] case bug?