Re: O(n) tasks cause lengthy startups and checkpoints - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: O(n) tasks cause lengthy startups and checkpoints
Date
Msg-id d4c7e393-b8be-4879-ad4b-4e80994e5763@www.fastmail.com
Whole thread Raw
In response to Re: O(n) tasks cause lengthy startups and checkpoints  ("Bossart, Nathan" <bossartn@amazon.com>)
Responses Re: O(n) tasks cause lengthy startups and checkpoints
List pgsql-hackers
On Wed, Dec 1, 2021, at 9:19 PM, Bossart, Nathan wrote:
On 12/1/21, 2:56 PM, "Andres Freund" <andres@anarazel.de> wrote:
> On 2021-12-01 20:24:25 +0000, Bossart, Nathan wrote:
>> I realize adding a new maintenance worker might be a bit heavy-handed,
>> but I think it would be nice to have somewhere to offload tasks that
>> really shouldn't impact startup and checkpointing.  I imagine such a
>> process would come in handy down the road, too.  WDYT?
>
> -1. I think the overhead of an additional worker is disproportional here. And
> there's simplicity benefits in having a predictable cleanup interlock as well.

Another idea I had was to put some upper limit on how much time is
spent on such tasks.  For example, a checkpoint would only spend X
minutes on CheckPointSnapBuild() before giving up until the next one.
I think the main downside of that approach is that it could lead to
unbounded growth, so perhaps we would limit (or even skip) such tasks
only for end-of-recovery and shutdown checkpoints.  Perhaps the
startup tasks could be limited in a similar fashion.
Saying that a certain task is O(n) doesn't mean it needs a separate process to
handle it. Did you have a use case or even better numbers (% of checkpoint /
startup time) that makes your proposal worthwhile?

I would try to optimize (1) and (2). However, delayed removal can be a
long-term issue if the new routine cannot keep up with the pace of file
creation (specially if the checkpoints are far apart).

For (3), there is already a GUC that would avoid the slowdown during startup.
Use it if you think the startup time is more important that disk space occupied
by useless files.

For (4), you are forgetting that the on-disk state of replication slots is
stored in the pg_replslot/SLOTNAME/state. It seems you cannot just rename the
replication slot directory and copy the state file. What happen if there is a
crash before copying the state file?

While we are talking about items (1), (2) and (4), we could probably have an
option to create some ephemeral logical decoding files into ramdisk (similar to
statistics directory). I wouldn't like to hijack this thread but this proposal
could alleviate the possible issues that you pointed out. If people are
interested in this proposal, I can start a new thread about it.


--
Euler Taveira

pgsql-hackers by date:

Previous
From: "osumi.takamichi@fujitsu.com"
Date:
Subject: RE: Optionally automatically disable logical replication subscriptions on error
Next
From: Amit Kapila
Date:
Subject: Re: Data is copied twice when specifying both child and parent table in publication