Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad
Date
Msg-id CA+hUKG+Fug3=Hw2dAiVEMqbC-F_1tKNo_Ev5q6-5Ww7aUGKDWw@mail.gmail.com
Whole thread Raw
In response to Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sun, Feb 27, 2022 at 10:29 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> On Sun, Feb 27, 2022 at 06:10:45PM +0900, Michael Paquier wrote:
> > On Sat, Feb 26, 2022 at 01:39:42PM -0800, Andres Freund wrote:
> > > I suspect the easiest is to just convert that usleep to a WaitLatch(). That'd
> > > require adding a new enum value to WaitEventTimeout in 14. Which probably is
> > > fine?
> >
> > We've added wait events in back-branches in the past, so this does not
> > strike me as a problem as long as you add the new entry at the end of
> > the enum, while keeping things ordered on HEAD.
>
> +1

+1

Sleeps like these are also really bad for ProcSignalBarrier, which I
was expecting to be the impetus for fixing any remaining loops like
this.

With the attached, 027_stream_regress.pl drops from ~29.5s to ~19.6s
on my FreeBSD workstation!

It seems a little strange to introduce a new wait event that will very
often appear into a stable branch, but ... it is actually telling the
truth, so there is that.

The sleep/poll loop in RegisterSyncRequest() may also have another
problem.  The comment explains that it was a deliberate choice not to
do CHECK_FOR_INTERRUPTS() here, which may be debatable, but I don't
think there's an excuse to ignore postmaster death in a loop that
presumably becomes infinite if the checkpointer exits.  I guess we
could do:

-               pg_usleep(10000L);
+               WaitLatch(NULL, WL_EXIT_ON_PM_DEATH | WL_TIMEOUT, 10,
WAIT_EVENT_SYNC_REQUEST);

But... really, this should be waiting on a condition variable that the
checkpointer broadcasts on when the queue goes from full to not full,
no?  Perhaps for master only?

Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Missed condition-variable wakeups on FreeBSD
Next
From: Japin Li
Date:
Subject: Re: CREATE DATABASE IF NOT EXISTS in PostgreSQL