Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad
Date
Msg-id CA+hUKGJf4iF2tnLfN5zWYPjBNkRSNsQOmmTrDpYw-cBrSyrGfg@mail.gmail.com
Whole thread Raw
In response to Re: Checkpointer sync queue fills up / loops around pg_usleep() are bad  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Mar 2, 2022 at 10:58 AM Andres Freund <andres@anarazel.de> wrote:
> On 2022-03-02 06:46:23 +1300, Thomas Munro wrote:
> > From a9344bb2fb2a363bec4be526f87560cb212ca10b Mon Sep 17 00:00:00 2001
> > From: Thomas Munro <thomas.munro@gmail.com>
> > Date: Mon, 28 Feb 2022 11:27:05 +1300
> > Subject: [PATCH v2 1/3] Wake up for latches in CheckpointWriteDelay().
>
> LGTM. Would be nice to have this fixed soon, even if it's just to reduce test
> times :)

Thanks for the review.  Pushed to master and 14, with the wait event
moved to the end of the enum for the back-patch.

> > From 1eb0266fed7ccb63a2430e4fbbaef2300f2aa0d0 Mon Sep 17 00:00:00 2001
> > From: Thomas Munro <thomas.munro@gmail.com>
> > Date: Tue, 1 Mar 2022 11:38:27 +1300
> > Subject: [PATCH v2 2/3] Fix waiting in RegisterSyncRequest().
>
> LGTM.

Pushed as far back as 12.  It could be done for 10 & 11, but indeed
the code starts getting quite different back there, and since there
are no field reports, I think that's OK for now.

A simple repro, for the record: run installcheck with
shared_buffers=256kB, and then partway through, kill -STOP
$checkpointer to simulate being stalled on IO for a while.  Backends
will soon start waiting for the checkpointer to drain the queue while
dropping relations.  This state was invisible to pg_stat_activity, and
hangs forever if you kill the postmaster and CONT the checkpointer.

> > From 50060e5a0ed66762680ddee9e30acbad905c6e98 Mon Sep 17 00:00:00 2001
> > From: Thomas Munro <thomas.munro@gmail.com>
> > Date: Tue, 1 Mar 2022 17:34:43 +1300
> > Subject: [PATCH v2 3/3] Use condition variable to wait when sync request queue
> >  is full.

> [review]

I'll come back to 0003 (condition variable-based improvement) a bit later.



pgsql-hackers by date:

Previous
From: "wangw.fnst@fujitsu.com"
Date:
Subject: RE: Logical replication timeout problem
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: BufferAlloc: don't take two simultaneous locks