Thread: deadlock in REINDEX

deadlock in REINDEX

From
Neil Conway
Date:
I noticed a pretty obscure deadlock condition with REINDEX in CVS HEAD:

client1:

nconway=# create table a (b int unique, c int unique);
CREATE TABLE
nconway=# begin;
BEGIN
nconway=# lock table a in access exclusive mode;
LOCK TABLE

client2:

nconway=# reindex index a_b_key;
< blocks, waiting to acquire an access exclusive lock on the heap
relation "a" >

client1:

nconway=# reindex table a;
ERROR:  deadlock detected

Naturally, this situation is not a very common one. But it seems to me
that the practice of acquiring locks in REINDEX in an inconsistent order
is asking for trouble: REINDEX TABLE locks the heap rel first, followed
by any indexes of the heap rel, but REINDEX INDEX locks the target
index, followed by the heap rel. Hence a deadlock condition (the
explicit lock table above just serves to make the window of opportunity
much larger).

This should be fixed, right?

I was thinking of changing reindex_index() to acquire an AccessShareLock
on the index in question, find its parent rel ID, release the lock, then
acquire an AccessExclusiveLock on the parent rel, followed by an
AccessExclusiveLock on the index in question.

Comments?

Cheers,

Neil

P.S. I noticed this because I was browsing through REINDEX, trying to
see if it would be possible to allow it to acquire less exclusive
locks...
-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC





Re: deadlock in REINDEX

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> Naturally, this situation is not a very common one. But it seems to me
> that the practice of acquiring locks in REINDEX in an inconsistent order
> is asking for trouble: REINDEX TABLE locks the heap rel first, followed
> by any indexes of the heap rel, but REINDEX INDEX locks the target
> index, followed by the heap rel. Hence a deadlock condition (the
> explicit lock table above just serves to make the window of opportunity
> much larger).

> This should be fixed, right?

Only if the cure isn't worse than the disease.

> I was thinking of changing reindex_index() to acquire an AccessShareLock
> on the index in question, find its parent rel ID, release the lock, then
> acquire an AccessExclusiveLock on the parent rel, followed by an
> AccessExclusiveLock on the index in question.

If you release the lock then I think you are opening yourself to worse
troubles than this one, having to do with someone renaming/deleting the
table and/or index out from under you.  The fact that REINDEX INDEX
might fail if there are concurrent conflicting operations doesn't bother
me a whole lot; but not holding a lock throughout the operation does.

AFAICS, REINDEX INDEX is only a disaster-recovery tool anyway, and so is
not likely to be run in parallel with other operations.  The scenarios
I can think of where you might want to do REINDEX routinely would always
use REINDEX TABLE, I should think.

BTW, I imagine DROP INDEX has a similar issue, and CLUSTER might
depending on what it locks first (but it would be easy to fix it
to lock the table first, since it has both names).
        regards, tom lane


Re: deadlock in REINDEX

From
Neil Conway
Date:
On Mon, 2003-02-17 at 18:39, Tom Lane wrote:
> > I was thinking of changing reindex_index() to acquire an AccessShareLock
> > on the index in question, find its parent rel ID, release the lock, then
> > acquire an AccessExclusiveLock on the parent rel, followed by an
> > AccessExclusiveLock on the index in question.
> 
> If you release the lock then I think you are opening yourself to worse
> troubles than this one, having to do with someone renaming/deleting the
> table and/or index out from under you.

Presumably, the renaming/deleting operation acquires an exclusive lock
and then holds it until transaction commit, right? If so, then wouldn't
we still be okay: the REINDEX would lock the index in access share mode,
find the OID of the heap rel, unlock the index, lock the heap rel in
access exclusive mode, then try to re-open & lock the index, find that
it no longer exists and then elog(ERROR).

Whether or not that solution actually works, ISTM there must be *some*
method of locking that is free of deadlocks -- saying "oh well, it's not
a common case anyway" doesn't strike me as being satisfactory :-\

Cheers,

Neil
-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC





Re: deadlock in REINDEX

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> On Mon, 2003-02-17 at 18:39, Tom Lane wrote:
>> If you release the lock then I think you are opening yourself to worse
>> troubles than this one, having to do with someone renaming/deleting the
>> table and/or index out from under you.

> Presumably, the renaming/deleting operation acquires an exclusive lock
> and then holds it until transaction commit, right? If so, then wouldn't
> we still be okay: the REINDEX would lock the index in access share mode,
> find the OID of the heap rel, unlock the index, lock the heap rel in
> access exclusive mode, then try to re-open & lock the index, find that
> it no longer exists and then elog(ERROR).

That approach might be deadlock-free, but that doesn't mean it is
surprise-free.  For example, if the other guy did an ALTER TABLE RENAME
on the index, it'd be possible that what you are actually reindexing
is now differently named than it was before (and, perhaps, there is now
some other index that has the original name and is the one the user
really meant).  This is not so dangerous in the REINDEX case, maybe,
but it could be unhappy-making in the DROP case.

> Whether or not that solution actually works, ISTM there must be *some*
> method of locking that is free of deadlocks -- saying "oh well, it's not
> a common case anyway" doesn't strike me as being satisfactory :-\

Basically, I'm not convinced that a deadlock failure is so much worse
than any other failure that we should open ourselves to other surprises
in order to avoid a deadlock.  It's a judgment call though.  Any other
comments out there?
        regards, tom lane


Re: deadlock in REINDEX

From
Gavin Sherry
Date:
On Mon, 17 Feb 2003, Tom Lane wrote:

> Neil Conway <neilc@samurai.com> writes:
> > On Mon, 2003-02-17 at 18:39, Tom Lane wrote:
> >> If you release the lock then I think you are opening yourself to worse
> >> troubles than this one, having to do with someone renaming/deleting the
> >> table and/or index out from under you.
> 
> > Presumably, the renaming/deleting operation acquires an exclusive lock
> > and then holds it until transaction commit, right? If so, then wouldn't
> > we still be okay: the REINDEX would lock the index in access share mode,
> > find the OID of the heap rel, unlock the index, lock the heap rel in
> > access exclusive mode, then try to re-open & lock the index, find that
> > it no longer exists and then elog(ERROR).
> 
> That approach might be deadlock-free, but that doesn't mean it is
> surprise-free.  For example, if the other guy did an ALTER TABLE RENAME

Perhaps the change that needs to be made is:

if(IsUnderPostmaster)elog(ERROR,"You cannot run REINDEX INDEX in multi-user mode");

to ReindexIndex() or some other appropriate place (with a better error
message).

Gavin



Re: deadlock in REINDEX

From
Olleg Samoylov
Date:
On Tue, 18 Feb 2003, Gavin Sherry wrote:

GS>Perhaps the change that needs to be made is:
GS>
GS>if(IsUnderPostmaster)
GS>    elog(ERROR,"You cannot run REINDEX INDEX in multi-user mode");
GS>
GS>to ReindexIndex() or some other appropriate place (with a better error
GS>message).
GS>
GS>Gavin

It is incorrect. In PostgreSQL REINDEX must be routing (PostgreSQL 7.3.1
Documentation 8.3. Routine Reindexing).

-- 
Olleg Samoylov