Re: 001_rep_changes.pl stalls - Mailing list pgsql-hackers

From Noah Misch
Subject Re: 001_rep_changes.pl stalls
Msg-id 20200420075954.GB1395671@rfd.leadboat.com
In response to Re: 001_rep_changes.pl stalls  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: 001_rep_changes.pl stalls
List pgsql-hackers
On Mon, Apr 20, 2020 at 04:15:40PM +0900, Kyotaro Horiguchi wrote:
> At Sat, 18 Apr 2020 00:01:42 -0700, Noah Misch <noah@leadboat.com> wrote in 
> > On Fri, Apr 17, 2020 at 05:06:29PM +0900, Kyotaro Horiguchi wrote:
> > > At Fri, 17 Apr 2020 17:00:15 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> > > > By the way, if the latch is consumed in WalSndLoop, a succeeding call to
> > > > WalSndWaitForWal cannot be woken up by the latch-set.  Doesn't that
> > > > cause missing wakeups? (in other words, overlooking of the wakeup latch).
> > > 
> > > - Since the only source other than timeout of walsender wakeup is latch,
> > > - we should avoid wasteful consuming of latch. (It is the same issue
> > > - with [1]).
> > > 
> > > + Since walsender is woken up by LSN advancement via latch, we should
> > > + avoid wasteful consuming of latch. (It is the same issue with [1]).
> > > 
> > > 
> > > > If wakeup signal is not remembered on walsender (like
> > > > InterruptPending), WalSndPhysical cannot enter a sleep with
> > > > confidence.
> > 
> > No; per latch.h, "What must be avoided is placing any checks for asynchronous
> > events after WaitLatch and before ResetLatch, as that creates a race
> > condition."  In other words, the thing to avoid is calling ResetLatch()
> > without next examining all pending work that a latch would signal.  Each
> > walsender.c WaitLatch call does follow the rules.
> 
> I didn't mean that, of course.  I thought of more or less the same
> with moving the trigger from latch to signal then the handler sets a
> flag and SetLatch().  If we use bare latch, we should avoid false
> entering to sleep, which also makes things complex.

I don't understand.  If there's a defect, can you write a test case or
describe a sequence of events (e.g. at line X, variable Y has value Z)?
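
As an illustration of the kind of event sequence meant here, and of the
latch.h rule quoted above, the following is a toy single-threaded model.
It is not PostgreSQL code: Sim, producer_event, check_work,
wait_would_block and both stranded_work_* functions are names invented
for this sketch, and the "asynchronous" event is simply injected at a
fixed point in the loop.  Each function returns how much work is still
pending when the consumer finally goes to sleep with the latch clear.

```c
#include <stdbool.h>

/* Toy state: one latch flag plus a counter of work not yet examined. */
typedef struct
{
    bool latch_set;   /* stands in for the latch's "is set" state */
    int  pending;     /* work queued by the producer, not yet handled */
    int  handled;     /* work the consumer has processed */
} Sim;

/* The asynchronous event: queue work first, then set the latch. */
static void producer_event(Sim *s)
{
    s->pending++;
    s->latch_set = true;
}

/* The consumer examines all pending work. */
static void check_work(Sim *s)
{
    s->handled += s->pending;
    s->pending = 0;
}

/* A real WaitLatch would sleep only if the latch is not set. */
static bool wait_would_block(const Sim *s)
{
    return !s->latch_set;
}

/* Order: ResetLatch, check work, WaitLatch.  An event that arrives after
 * the check leaves the latch set, so the wait returns at once and the
 * next iteration picks the work up. */
int stranded_work_safe_order(void)
{
    Sim s = {false, 0, 0};

    producer_event(&s);                 /* initial wakeup */
    for (int iter = 0; iter < 10; iter++)
    {
        s.latch_set = false;            /* ResetLatch */
        check_work(&s);
        if (iter == 0)
            producer_event(&s);         /* event lands after the check */
        if (wait_would_block(&s))       /* WaitLatch */
            return s.pending;           /* asleep: report stranded work */
    }
    return -1;                          /* never slept (not expected) */
}

/* Order: WaitLatch, check work, ResetLatch.  The check sits between
 * WaitLatch and ResetLatch, so an event in that window has its latch-set
 * wiped out by the reset. */
int stranded_work_racy_order(void)
{
    Sim s = {false, 0, 0};

    producer_event(&s);                 /* initial wakeup */
    for (int iter = 0; iter < 10; iter++)
    {
        if (wait_would_block(&s))       /* WaitLatch */
            return s.pending;           /* asleep: report stranded work */
        check_work(&s);
        if (iter == 0)
            producer_event(&s);         /* event lands in the race window */
        s.latch_set = false;            /* ResetLatch discards the wakeup */
    }
    return -1;                          /* never slept (not expected) */
}
```

In this model the safe ordering goes to sleep with nothing pending,
while the racy ordering goes to sleep with one item stranded.  A report
of a walsender defect would need to show an equivalent interleaving
against the actual walsender.c code paths.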


