Re: Tricky bugs in concurrent index build - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Tricky bugs in concurrent index build
Date
Msg-id 13849.1156614154@sss.pgh.pa.us
Whole thread Raw
In response to Re: Tricky bugs in concurrent index build  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Tricky bugs in concurrent index build
List pgsql-hackers
I wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
>> b) You introduced a LockRelationIdForSession() call (I even didn't realize we
>> had this capability when I wrote the patch). Does this introduce the
>> possibility of a deadlock though?

> Indeed, I had noticed this while testing your point (a).  I thought that
> ConditionalLockRelation had enough smarts to grant itself a lock if
> there would be a deadlock possibility otherwise, but it seems not to be
> noticing.  We'll need to look into that.

OK, the code I was remembering is the bit in ProcSleep to grant our proc
the lock if we can prove that deadlock detection would do so anyway.
It is probably possible to refactor things so that that's done before
we fail in the ConditionalLockAcquire case, but it would involve
uglifying ProcSleep's API even more.  And it doesn't sound like a
complete solution anyway --- the ProcSleep test is not capable of
detecting deadlocks involving more than one lock, and I'm not sure it
even intends to find all cases involving just one lock.  It was only
ever meant as a simple optimization to avoid a sleep in easy cases.

>> To solve that we would have to replace the pg_sleep call with a
>> XactLockTableWait. But I'm not clear how to find a transaction id to wait
>> on.

I'm toying with the idea of adding a lock manager call defined as
"give me a list of XIDs that currently hold locks conflicting with
lockmode X on object Y" --- but not including XIDs merely waiting
for such a lock.  Then we could get the list of XIDs currently blocking
ExclusiveLock, and do XactLockTableWait on each one.  Once they are
all gone we are safe; we don't care about later acquirers of locks
because they must see the index.  This avoids both the arbitrary
pg_sleep and the need for a check on "am I the oldest remaining XID";
and if we are in a deadlock situation it will be detected.

This looks like it should be easy enough to code (though I haven't
actually tried yet) ... do you see any logical flaws?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [DOCS] New XML section for documentation
Next
From: Chahine Hamila
Date:
Subject: Re: integration of pgcluster into postgresql