Re: postmaster 8.2 eternally hangs in sempaphore lock acquiring - Mailing list pgsql-bugs

From Tom Lane
Subject Re: postmaster 8.2 eternally hangs in sempaphore lock acquiring
Msg-id 26770.1175193487@sss.pgh.pa.us
In response to Re: postmaster 8.2 eternally hangs in sempaphore lock acquiring  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
I wrote:
> It's possible that this is not a deadlock per se, but the aftermath of
> someone having errored out without releasing the BtreeVacuumLock --- but
> I don't entirely see how that could happen either, at least not without
> a core dump scenario.

On closer inspection, the autovac stack trace

#4 0x080abe38 in _bt_end_vacuum (rel=0xb5f0b298) at nbtutils.c:1028
#5 0x080a9c68 in btbulkdelete (fcinfo=0xbfc58cd8) at nbtree.c:552

suggests that _bt_end_vacuum is being called from the CATCH part of
btbulkdelete, and that provides an idea: if either of the elog(ERROR)
calls in _bt_start_vacuum were to actually fire, it would throw control
out of the function without having released BtreeVacuumLock, and then
_bt_end_vacuum would hang trying to acquire that same lock.
_bt_start_vacuum is coded on the assumption that the LWLock would get
released by transaction abort cleanup, but we'd fail before getting
there, because the CATCH block runs first.  So this is definitely a
bug, but the next question is what's triggering it --- both of those
elogs should be "can't happen" conditions.
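
To make the suspected sequence concrete, here is a minimal standalone
sketch of it.  This is not PostgreSQL code: the toy_* names are
hypothetical, setjmp/longjmp stands in for the TRY/CATCH machinery, and
a plain flag stands in for the LWLock (it reports "would block" rather
than actually hanging).  The release_before_error switch shows one
possible remedy, which is an assumption for illustration and not
something proposed in this message.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Toy stand-in for a non-reentrant lock such as BtreeVacuumLock. */
static bool lock_held = false;

/* Jump target standing in for the CATCH block's error context. */
static jmp_buf error_context;

/* Returns false at the point where a real lock acquire would block forever. */
static bool toy_lock_acquire(void)
{
    if (lock_held)
        return false;
    lock_held = true;
    return true;
}

static void toy_lock_release(void)
{
    assert(lock_held);
    lock_held = false;
}

/* Stand-in for elog(ERROR): a nonlocal exit that skips the normal release. */
static void toy_error(void)
{
    longjmp(error_context, 1);
}

/* Mirrors the shape described for _bt_start_vacuum: lock, maybe error, unlock. */
static void toy_start_vacuum(bool fire_error, bool release_before_error)
{
    if (!toy_lock_acquire())
        return;
    if (fire_error)
    {
        if (release_before_error)
            toy_lock_release();   /* hypothetical remedy */
        toy_error();              /* without the remedy, the lock stays held */
    }
    toy_lock_release();
}

/* Mirrors _bt_end_vacuum: needs the same lock.  False means "would hang". */
static bool toy_end_vacuum(void)
{
    if (!toy_lock_acquire())
        return false;
    toy_lock_release();
    return true;
}

/* Mirrors btbulkdelete: cleanup runs on both the normal and the error path. */
static bool toy_bulkdelete_would_hang(bool fire_error, bool release_before_error)
{
    lock_held = false;
    if (setjmp(error_context) == 0)
    {
        toy_start_vacuum(fire_error, release_before_error);
        return !toy_end_vacuum();   /* normal path */
    }
    return !toy_end_vacuum();       /* "CATCH" path */
}
```

Calling toy_bulkdelete_would_hang(true, false) reports that the cleanup
step would block on the lock its own process still holds, while the
no-error case and the release-before-error case both get through.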

> Is there anything in the postmaster log when this happens?

I repeat that with more urgency.  Do you see any
"multiple active vacuums for index \"%s\"" or "out of btvacinfo slots"
log messages when these hangups occur?

            regards, tom lane
