Marko Tiikkaja <marko@joh.to> writes:
> In the meanwhile I'll be happy to provide more information if someone
> has any ideas.

I don't suppose there's anything in the postmaster log suggesting trouble
with accessing pg_clog/ files shortly before the lockup?

I concur with Alvaro's assessment that this looks like a bunch of
processes all waiting for somebody else to finish reading the clog page
they want. If somebody had forgotten to unlock the I/O lock after loading
a clog buffer, we could get this symptom later on; but I find it hard to
see how that could happen.

BTW ... just looking at the code a bit ... I wonder whether there is any
interlock that ensures that listen-queue messages will be cleared before
the originating transaction's entry is truncated away from pg_clog?
It doesn't seem like an ancient XID in the message queue could cause this
particular symptom, but I could see it leading to "could not access
status of transaction" failures.
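
To make that concern concrete, here is a toy Python model. It is not
PostgreSQL code: ToyClog, safe_cutoff, and the (xid, payload) queue are
hypothetical names invented for illustration, and XID wraparound is
ignored. It sketches how a queued message whose XID falls below the
truncation point would fail a status lookup, and what a minimal interlock
(clamping the cutoff to the oldest queued XID) might look like.

```python
from collections import deque


class ToyClog:
    """Toy commit log: maps XID -> committed flag and can drop old entries."""

    def __init__(self):
        self.status = {}
        self.oldest_xid = 0  # entries below this have been truncated away

    def record(self, xid, committed):
        self.status[xid] = committed

    def truncate_before(self, cutoff_xid):
        # Discard status for every XID older than the cutoff.
        for xid in [x for x in self.status if x < cutoff_xid]:
            del self.status[xid]
        self.oldest_xid = max(self.oldest_xid, cutoff_xid)

    def lookup(self, xid):
        if xid < self.oldest_xid:
            # Stand-in for the kind of failure described above.
            raise LookupError(f"could not access status of transaction {xid}")
        return self.status[xid]


def safe_cutoff(queue, proposed_cutoff):
    """Hypothetical interlock: never truncate past the oldest queued XID."""
    oldest_queued = min((xid for xid, _ in queue), default=proposed_cutoff)
    return min(proposed_cutoff, oldest_queued)
```

Without the clamp, truncate_before(150) followed by lookup(100) raises for
a message still sitting in the queue with XID 100; with it, the cutoff is
held back to 100 and the lookup still succeeds.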

regards, tom lane