Re: Separate memory contexts for relcache and catcache - Mailing list pgsql-hackers

From David Rowley
Subject Re: Separate memory contexts for relcache and catcache
Date
Msg-id CAApHDvpe8KhrCfgba1aXWemswgpK5dg7Za=wMh8hvotuGfyC0Q@mail.gmail.com
In response to Re: Separate memory contexts for relcache and catcache  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Thu, 10 Aug 2023 at 01:23, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Aug-09, Melih Mutlu wrote:
>
> > --Patch
> >          name          | used_bytes | free_bytes | total_bytes
> > -----------------------+------------+------------+-------------
> >  RelCacheMemoryContext |    4706464 |    3682144 |     8388608
> >  CatCacheMemoryContext |    3489384 |     770712 |     4260096
> >  index info            |    2102160 |     113776 |     2215936
> >  CacheMemoryContext    |       2336 |       5856 |        8192
> >  relation rules        |       4416 |       3776 |        8192
> > (5 rows)
>
> Hmm, is this saying that there's too much fragmentation in the relcache
> context?

free_bytes is just the space in the blocks that are not being used by
any allocated chunks or chunks on the freelist.

It looks like RelCacheMemoryContext has 11 blocks: the 8kB initial
(keeper) block plus 10 malloc'd blocks:

postgres=# select 8192 + sum(8192*power(2,x)) as total_bytes from
generate_series(0,9) x;
 total_bytes
-------------
     8388608

The first 2 blocks are 8kB as we only start doubling after we malloc
the first 8kB block after the keeper block.

If there were 1 fewer block then total_bytes would be 4194304, which is
less than the used_bytes for that context, so all 11 blocks look
needed.
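
The block-size arithmetic above can be sketched as a small model. This
is only an illustration under assumptions: the function name and its
parameters are made up here, the 8MB cap is assumed to be the default
maximum block size, and dedicated blocks for oversized chunks are
ignored. It lists the cumulative total with and without the largest
block and compares it to the used_bytes reported for
RelCacheMemoryContext.

```python
def aset_block_sizes(n_blocks, init_block_size=8192,
                     max_block_size=8 * 1024 * 1024):
    """Sizes of the first n_blocks blocks in a simplified aset model.

    Assumption: the keeper block and the first malloc'd block are both
    init_block_size, and each later block doubles up to max_block_size.
    """
    sizes = []
    next_size = init_block_size
    for i in range(n_blocks):
        if i == 0:
            # Keeper block: created with the context, no doubling yet.
            sizes.append(init_block_size)
            continue
        sizes.append(next_size)
        next_size = min(next_size * 2, max_block_size)
    return sizes


# used_bytes for RelCacheMemoryContext, taken from the quoted table.
USED_BYTES = 4706464

for n in (10, 11):
    total = sum(aset_block_sizes(n))
    print(n, total, "enough" if total >= USED_BYTES else "too small")
```

In this model the total only reaches 8388608 once the 4MB block is
included; without it the context tops out at 4194304, below used_bytes.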

> Maybe it would improve things to make it a SlabContext instead
> of AllocSet.  Or, more precisely, a bunch of SlabContexts, each with the
> appropriate chunkSize for the object being stored.

It would at least save us from the power-of-2 rounding that aset
does. However, at a quick glance, it seems not all the size
requests in relcache.c are fixed.  I see a datumCopy() in
RelationBuildTupleDesc() for the attmissingval stuff, so we couldn't
SlabAlloc that.

It could be worth looking at the size classes of the fixed-sized
allocations to estimate how much memory we might save by using slab to
avoid the power-of-2 rounding that aset.c does. However, if there are too
many contexts then we may end up using more memory with all the
mostly-empty contexts for backends that only query a tiny number of
tables.  That might not be good.  Slab also does not do block doubling
like aset does, so it might be hard to choose a good block size.
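
As a rough way to do that estimate, something like the following could
be used. It is a sketch only: the helper names are invented, the
request sizes are hypothetical rather than measured from relcache.c,
the 8-byte minimum chunk size is an assumption, and chunk headers and
requests above allocChunkLimit are ignored.

```python
def aset_chunk_size(request, min_chunk=8):
    """Round a request up to the next power of 2 (simplified aset model)."""
    size = min_chunk
    while size < request:
        size *= 2
    return size


def rounding_waste(requests):
    """Bytes lost to power-of-2 rounding for a list of request sizes."""
    return sum(aset_chunk_size(r) - r for r in requests)


# Hypothetical fixed-size requests, NOT taken from relcache.c.
sample = [24, 72, 136, 520]
for r in sample:
    rounded = aset_chunk_size(r)
    print(f"request={r} chunk={rounded} wasted={rounded - r}")
print("total wasted:", rounding_waste(sample))
```

Feeding in the real fixed-size request sizes and their counts would
give an upper bound on what slab could save, before accounting for any
per-context overhead from having more contexts.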

> (I don't say this
> because I know for a fact that Slab is better for these purposes; it's
> just that I happened to read its comments yesterday and they stated that
> it behaves better in terms of fragmentation.  Maybe Andres or Tomas have
> an opinion on this.)

I'm not sure of the exact comment, but I was in there recently and
there's a chance that I wrote that comment.  Slab prioritizes putting
new chunks on fuller blocks and may free() blocks once they become
empty of any chunks.  Aset does no free()ing of blocks unless a block
was malloc()ed especially for a chunk above allocChunkLimit.  That
means aset might hold a lot of malloc'ed memory for chunks that just
sit on freelists and might never be used again.  Meanwhile,
other request sizes may have to malloc new blocks.
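
The aset side of that can be illustrated with a toy model. This is not
how aset.c is written; the class and its methods are invented for
illustration, block boundaries and chunk headers are ignored, and
neither the allocChunkLimit path nor Slab is modelled. It only shows
freed chunks sitting on a per-size freelist while a different request
size still needs fresh space.

```python
from collections import defaultdict


class ToyAset:
    """Toy model: freed chunks go on a per-size freelist and are reused
    only for requests that round to the same size. Nothing is free()d."""

    def __init__(self):
        self.freelists = defaultdict(int)  # chunk size -> free chunk count
        self.carved_bytes = 0              # bytes ever carved from blocks

    @staticmethod
    def _round(request):
        size = 8
        while size < request:
            size *= 2
        return size

    def alloc(self, request):
        size = self._round(request)
        if self.freelists[size] > 0:
            self.freelists[size] -= 1      # reuse a freed chunk
        else:
            self.carved_bytes += size      # needs fresh block space
        return size

    def free(self, size):
        self.freelists[size] += 1          # memory stays in the context

    def freelist_bytes(self):
        return sum(size * n for size, n in self.freelists.items())


ctx = ToyAset()
chunks = [ctx.alloc(64) for _ in range(1000)]
for c in chunks:
    ctx.free(c)
before = ctx.carved_bytes
for _ in range(1000):
    ctx.alloc(128)
# Bytes idle on freelists vs. fresh bytes needed for the new size.
print(ctx.freelist_bytes(), ctx.carved_bytes - before)
```

In the toy run, the freed 64-byte chunks stay idle on their freelist
while every 128-byte request has to be carved from new space.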

David


