From: "Robert Haas" <robertmhaas@gmail.com>
> I think the problem here is that it actually is possible for one
> session to access the temporary objects of another session:
> Now, we could prohibit that specific thing. But at the very least, it
> has to be possible for one session to drop another session's temporary
> objects, because autovacuum does it eventually, and superusers will
> want to do it sooner to shut autovacuum up. So it's hard to reason
> about whether and to what extent it's safe to not send sinval messages
> for temporary objects.
I was a bit surprised to learn that one session can access the data of
another session's temporary tables. That implementation may be complicating
the situation with extra sinval messages.
> I think you might be approaching this problem from the wrong end,
> though. The question in my mind is: why does the
> StartTransactionCommand() / CommitTransactionCommand() pair in
> ProcessCatchupEvent() end up writing a commit record? The obvious
> possibility that occurs to me is that maybe rereading the invalidated
> catalog entries causes a HOT prune, and maybe there ought to be some
> way for a transaction that has only done HOT pruning to commit
> asynchronously, just as we already do for transactions that only
> modify temporary tables. Or, failing that, maybe there's a way to
> suppress synchronous commit for this particular transaction.
I was able to identify which WAL record is written in the transaction
started in ProcessCatchupEvent() by inserting elog() in XLogInsert(). The
record was (RM_STANDBY_ID, XLOG_STANDBY_LOCK).
The cause of the hang is now clear. It happens as follows:
1. When a transaction that used a temporary table created with ON COMMIT
DELETE ROWS commits, the sinval catchup signal (SIGUSR1) is sent from
smgrtruncate(). This is because the temporary table is truncated at
transaction end.
2. Another session, which is waiting for a client request, receives SIGUSR1.
It calls ProcessCatchupEvent().
3. ProcessCatchupEvent() calls StartTransactionCommand(), emits the
XLOG_STANDBY_LOCK WAL record, and then calls CommitTransactionCommand().
4. The commit then calls SyncRepWaitForLSN(), which in turn waits on the
latch.
5. But WaitLatch() never returns, because the session is still running
inside the SIGUSR1 handler entered in step 2. WaitLatch() needs a new
SIGUSR1 to be delivered before it can wake up.
I think there is a problem with the latch and SIGUSR1 mechanism. How can we
fix this problem?
Regards
MauMau