Re: reindex and copy - deadlock? - Mailing list pgsql-performance

From Tom Lane
Subject Re: reindex and copy - deadlock?
Date
Msg-id 12457.1086969920@sss.pgh.pa.us
Whole thread Raw
In response to Re: reindex and copy - deadlock?  (Litao Wu <litaowu@yahoo.com>)
Responses Re: reindex and copy - deadlock?  (Litao Wu <litaowu@yahoo.com>)
List pgsql-performance
Litao Wu <litaowu@yahoo.com> writes:
> (gdb) bt
> #0  0x420e8bb2 in semop () from /lib/i686/libc.so.6
> #1  0x080ffa28 in PGSemaphoreLock ()
> #2  0x08116432 in LWLockAcquire ()
> #3  0x0810f572 in LockBuffer ()
> #4  0x0807dea3 in _bt_getbuf ()
> #5  0x080813ec in _bt_leafbuild ()
> #6  0x080816a6 in _bt_leafbuild ()
> #7  0x08081b8b in _bt_leafbuild ()
> #8  0x080813cc in _bt_leafbuild ()
> #9  0x0807e1d0 in btbuild ()
> #10 0x081631c3 in OidFunctionCall3 ()
> #11 0x080920a7 in index_build ()
> #12 0x08092593 in reindex_index ()

Hmm.  I don't think I believe this backtrace.  It's obviously wrong at
lines 5-7 - _bt_leafbuild doesn't call itself nor call _bt_getbuf.
It's possible that you don't have any local symbols in this executable
and what we're seeing is the nearest global symbol, so let's ignore
that; but if we take lines 0-4 at face value, what it says is that the
REINDEX is stuck waiting for buffer lock on a buffer for a new empty
page it has just added to the new index.  This is flatly impossible.
There is no other process that could possibly be interested in that
buffer, or for that matter even be able to name it (since the new index
has a new relfilenode value that isn't even visible to any other process
yet).  I thought for a little bit that a background CHECKPOINT might be
trying to write out the new buffer, but that theory holds no water
either, because at this point in the _bt_getbuf sequence, the buffer is
not marked dirty (I just verified this by stepping through it in 7.4.2).

I can think of lots of reasons why the REINDEX might block at the
previous step of the sequence, namely acquiring a fresh buffer ... but
once it's got the buffer there is surely no reason to block.

What I'm inclined to think is that the backtrace isn't right at all.
Would it be possible for you to install a backend built with
--enable-debug and get a more reliable backtrace?

            regards, tom lane

pgsql-performance by date:

Previous
From: Litao Wu
Date:
Subject: Re: reindex and copy - deadlock?
Next
From: Tom Lane
Date:
Subject: Re: ORDER BY user defined function performance issues