Re: Checkpoint throttling issues - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Checkpoint throttling issues
Msg-id CA+TgmobZzOkSALE6sHNqO9hrL8Hj=u7VQFx3OpWC8n9zLWBgfg@mail.gmail.com
In response to Checkpoint throttling issues  (Andres Freund <andres@anarazel.de>)
Responses Re: Checkpoint throttling issues  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Oct 19, 2015 at 6:10 AM, Andres Freund <andres@anarazel.de> wrote:
> 1) The progress passed to CheckpointWriteDelay() will often be wrong -
>    it's calculated as num_written / num_to_write, but num_written is only
>    incremented if the buffer hasn't since independently been written
>    out. That's bad because it means we'll think we're further and
>    further behind if there's independent writeout activity.
>
>    Simple enough to fix, we gotta split num_written into num_written
>    (for stats purposes) and num_processed (for progress).
>
>    This is pretty much a bug, but I'm slightly worried about
>    backpatching a fix because it can have a rather noticeable
>    behavioural impact.

I think this is an algorithmic improvement, not a bug fix.  Actually,
I don't really think any of these things are bugs, properly considered
- they all look pretty intentional to me, even if we no longer agree
with the reasoning.  Maybe some of them could be back-patched anyway,
but at any rate I definitely wouldn't backpatch this or #3, because
even though changing this is probably better on average, it's hard
to be sure that it won't be worse for somebody.  In the back-branches,
I think stability takes priority over improvements.
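
To make the num_written / num_processed split from point 1 concrete, here
is a minimal simulation in Python. It is only a sketch of the idea, not
the PostgreSQL code: the function name checkpoint_progress and the
needs_write list are made up for illustration, and only the two counter
names come from the proposal quoted above.

```python
def checkpoint_progress(needs_write):
    """Walk the buffers scheduled for a checkpoint and track both counters.

    needs_write[i] is False when buffer i was already written out by some
    other process before the checkpointer reached it (hypothetical input).
    Returns a list of (old_progress, new_progress) pairs, one per buffer.
    """
    num_to_write = len(needs_write)
    num_written = 0    # buffers this checkpoint actually wrote (stats)
    num_processed = 0  # buffers this checkpoint has dealt with (progress)
    history = []
    for still_dirty in needs_write:
        if still_dirty:
            num_written += 1
        num_processed += 1
        history.append((num_written / num_to_write,
                        num_processed / num_to_write))
    return history


# Half of the buffers were written independently: the old measure stalls
# at 50% even though every buffer has been handled.
print(checkpoint_progress([True, False, False, True])[-1])
```

With progress taken from num_written, the checkpoint appears to be only
halfway done at the end of the scan; with num_processed it reports
completion, while num_written remains available for statistics.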

>    I think the sleep time should be computed adaptively based on the
>    number of buffers remaining and the remaining time. There are probably
>    better formulations, but that seems like an easy enough improvement
>    and considerably better than now.

One thing to keep in mind here is that somebody did work a few years
ago to reduce the number of wake-ups per second that PostgreSQL
generates when idle.  Obviously, getting the checkpointing behavior
correct is more important, and the system is not idle while we're
checkpointing, but it's still worth considering.  I like the idea of
an adaptive sleep time.
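
One possible shape for an adaptive sleep, sketched in Python under stated
assumptions: the nap is the remaining time divided by the remaining
buffers, and when that would be shorter than a floor, several buffers are
written per wake-up instead, which keeps the wake-up rate bounded. The
function name plan_next_nap and the min_nap_ms / max_nap_ms parameters
and their defaults are hypothetical and do not correspond to anything in
PostgreSQL.

```python
import math


def plan_next_nap(buffers_remaining, ms_remaining,
                  min_nap_ms=10.0, max_nap_ms=1000.0):
    """Return (buffers_to_write, nap_ms) for the next checkpoint cycle.

    All names and limits here are illustrative assumptions.
    """
    if buffers_remaining <= 0:
        return (0, 0.0)                  # nothing left to do
    if ms_remaining <= 0:
        return (buffers_remaining, 0.0)  # out of time: write without napping
    ideal_nap = ms_remaining / buffers_remaining  # even spacing per buffer
    if ideal_nap >= min_nap_ms:
        # One buffer per wake-up, but never sleep longer than the cap.
        return (1, min(ideal_nap, max_nap_ms))
    # Ideal nap is below the floor: batch writes so we wake up less often.
    batch = min(buffers_remaining, math.ceil(min_nap_ms / ideal_nap))
    return (batch, min_nap_ms)


print(plan_next_nap(100, 30000))    # plenty of time per buffer
print(plan_next_nap(10000, 30000))  # little time per buffer, so batch
```

Recomputing the plan on every cycle lets the nap shrink or the batch grow
automatically as the checkpoint falls behind, and the floor addresses the
wake-up concern at the cost of somewhat burstier writes.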

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


