Re: Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations - Mailing list pgsql-hackers

From Jonah H. Harris
Subject Re: Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations
Date
Msg-id CADUqk8X0QnU=g2ic7JGQcG1EE_J2c+Y2pTm0NcSPkY+D3zQm+Q@mail.gmail.com
Whole thread Raw
In response to Re: Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Fri, Feb 17, 2023 at 12:03 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 17 Feb 2023 at 17:40, Jonah H. Harris <jonah.harris@gmail.com> wrote:
> Yeah. There’s definitely a smarter and more reusable approach than I was proposing. A lot of that code is fairly mature and I figured more people wouldn’t want to alter it in such ways - but I’m up for it if an approach like this is the direction we’d want to go in.

Having something agnostic to if it's allocating a new context or
adding a block to an existing one seems like a good idea to me.

I like this idea.
 
I think the tricky part will be the discussion around which size
classes to keep around and in which cases can we use a larger
allocation without worrying too much that it'll be wasted. We also
don't really want to make the minimum memory that a backend can keep
around too bad. Patches such as [1] are trying to reduce that.  Maybe
we can just keep a handful of blocks of 1KB, 8KB and 16KB around, or
more accurately put, ALLOCSET_SMALL_INITSIZE,
ALLOCSET_DEFAULT_INITSIZE and ALLOCSET_DEFAULT_INITSIZE * 2, so that
it works correctly if someone adjusts those definitions.

Per that patch and the general idea, what do you think of either:

1. A single GUC, something like backend_keep_mem, that represents the cached memory we'd retain rather than send directly to free()?
2. Multiple GUCs, one per block size?

While #2 would give more granularity, I'm not sure it would necessarily be needed. The main issue I'd see in that case would be the selection approach to block sizes to keep given a fixed amount of keep memory. We'd generally want the majority of the next queries to make use of it as best as possible, so we'd either need each size to be equally represented or some heuristic.

I don't really like #2, but threw it out there :)

I think you'll want to look at what the maximum memory a backend can
keep around in context_freelists[] and not make the worst-case memory
consumption worse than it is today.

Agreed.
 
I imagine this would be some new .c file in src/backend/utils/mmgr
which aset.c, generation.c and slab.c each call a function from to see
if we have any cached blocks of that size.  You'd want to call that in
all places we call malloc() from those files apart from when aset.c
and generation.c malloc() for a dedicated block.  You can probably get
away with replacing all of the free() calls with a call to another
function where you pass the pointer and the size of the block to have
it decide if it's going to free() it or cache it.

Agreed. I would see this as practically just a generic allocator free-list; is that how you view it also?
 
I doubt you need to care too much if the block is from a dedicated allocation or a normal
block.  We'd just always free() if it's not in the size classes that
we care about.

Agreed.

--
Jonah H. Harris

pgsql-hackers by date:

Previous
From: Kirill Reshke
Date:
Subject: pg_init_privs corruption.
Next
From: "David G. Johnston"
Date:
Subject: Re: psql: Add role's membership options to the \du+ command