Re: win32 _dosmaperr() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: win32 _dosmaperr()
Date
Msg-id 20428.1123961668@sss.pgh.pa.us
In response to Re: win32 _dosmaperr()  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Qingqing Zhou wrote:
>> Things could get worse because the whole database cluster may stop working
>> and waiting for the buffer the bgwriter is working on, but bgwriter is
>> waiting for (by the deadloop in pgunlink) those postgres'es to move on (so
>> that they could close the problematic xlog segment), which is a deadlock.

I think that analysis is bogus.  The bgwriter only tries to unlink xlog
segments during post-checkpoint cleanup, at which point it isn't holding
any buffer locks.  Likewise, while backends might wait trying to remove
a table file because the bgwriter has the file open, in that state they
aren't blocking the bgwriter either.

In the latter case, the backends will have to wait till the bgwriter
closes the file, which it'll do not later than the next checkpoint.
I wonder whether the complaints are coming from people who don't know
about that, and didn't wait long enough?

There could be a deadlock if a backend is holding open an old xlog
segment while it executes a CHECKPOINT command, because then it'll
wait for the bgwriter, and the bgwriter might think it could remove
the xlog file during the checkpoint.

Another form could only happen between two backends: A is trying to
unlink file F, which backend B has open, and then for some unrelated
reason B has to wait for a lock held by A.  The bgwriter neither takes
nor waits for locks, so this doesn't apply to it.
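
To make the distinction concrete, here is a minimal sketch that models
the cases above as a wait-for graph and looks for a cycle.  It is an
illustration only, not anything in the PostgreSQL sources: the function
name find_wait_cycle and the process labels ("A", "B", "backend",
"bgwriter") are assumptions made up for this example, and an edge X -> Y
just means "X cannot proceed until Y does something" (closes a file,
releases a lock, finishes a checkpoint).

```python
from typing import Dict, List, Optional


def find_wait_cycle(waits_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in a wait-for graph as a list of names, or None.

    waits_for maps each process to the processes it is blocked on.
    Hypothetical helper for illustration; plain depth-first search.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in waits_for}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        stack.append(node)
        for target in waits_for.get(node, []):
            state = color.get(target, WHITE)
            if state == GREY:
                # target is already on the current path: that is a cycle
                return stack[stack.index(target):] + [target]
            if state == WHITE:
                found = visit(target)
                if found is not None:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(waits_for):
        if color[node] == WHITE:
            found = visit(node)
            if found is not None:
                return found
    return None


# Backend waits for the bgwriter to close a table file; the bgwriter is
# not waiting on the backend, so there is no cycle.
TABLE_FILE_WAIT = {"backend": ["bgwriter"], "bgwriter": []}

# Backend holds an old xlog segment open and runs CHECKPOINT (waits for
# the bgwriter); the bgwriter tries to unlink that segment (waits for
# the backend to close it).
CHECKPOINT_CASE = {"backend": ["bgwriter"], "bgwriter": ["backend"]}

# A waits for B to close file F; B waits for a lock held by A.
TWO_BACKENDS = {"A": ["B"], "B": ["A"]}
```

Under these assumptions only the second and third graphs contain a
cycle; the plain "wait for the bgwriter to close the file" case is a
delay, not a deadlock.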

But none of this should be happening because we're supposedly always
opening all these files with the magic sharing flag.
        regards, tom lane

