Re: O(n) tasks cause lengthy startups and checkpoints - Mailing list pgsql-hackers

From Robert Haas
Subject Re: O(n) tasks cause lengthy startups and checkpoints
Date
Msg-id CA+Tgmoag+stJPU9Vxoyaq6gzZGPLQ9C+edZcKb4PgW-yTB8LaA@mail.gmail.com
In response to Re: O(n) tasks cause lengthy startups and checkpoints  ("Bossart, Nathan" <bossartn@amazon.com>)
Responses Re: O(n) tasks cause lengthy startups and checkpoints  ("Bossart, Nathan" <bossartn@amazon.com>)
List pgsql-hackers
On Fri, Dec 10, 2021 at 2:03 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> Well, I haven't had a chance to look at your patch, and my patch set
> still only has handling for CheckPointSnapBuild() and
> RemovePgTempFiles(), but I thought I'd share what I have anyway.  I
> split it into 5 patches:
>
> 0001 - Adds a new "custodian" auxiliary process that does nothing.
> 0002 - During startup, remove the pgsql_tmp directories instead of
>        only clearing the contents.
> 0003 - Split temporary file cleanup during startup into two stages.
>        The first renames the directories, and the second clears them.
> 0004 - Moves the second stage from 0003 to the custodian process.
> 0005 - Moves CheckPointSnapBuild() to the custodian process.

I don't know whether this kind of idea is good or not.

One thing we've seen a number of times now is that entrusting the same
process with multiple responsibilities often ends poorly. Sometimes
it's busy with one thing when another thing really needs to be done
RIGHT NOW. Perhaps that won't be an issue here since all of these
things are related to checkpointing, but then the process name should
reflect that rather than making it sound like we can just keep piling
more responsibilities onto this process indefinitely. At some point
that seems bound to become an issue.

Another issue is that we don't want to increase the number of
processes without bound. Processes use memory and CPU resources, and if
we run too many of them they become a burden on the system. Low-end
systems may not have many resources to begin with, and high-end systems
can struggle to fit demanding workloads within the resources that they
have. Maybe it would be cheaper to do more things at once if we were
using threads rather than processes, but that day still seems fairly
far off.

But against all that, if these tasks are slowing down checkpoints and
that's avoidable, that seems pretty important too. Interestingly, I
can't say that I've ever seen any of these things be a problem for
checkpoint or startup speed. I wonder why you've had a different
experience.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


