Thread: Separate memory contexts for relcache and catcache

Separate memory contexts for relcache and catcache

From
Melih Mutlu
Date:
Hi hackers,

Most catcache and relcache entries (other than index info etc.) currently go straight into CacheMemoryContext. I believe these two caches are usually the largest contributors to CacheMemoryContext's memory usage. For example, when lots of database objects are accessed in a long-lived connection, CacheMemoryContext tends to grow significantly.

While I've been working on another patch for the pg_backend_memory_contexts view, we thought it would also be useful to see the memory usage of the different kinds of caches broken down into their own contexts. The attached patch implements this and makes it easy to track the memory used by relcache and catcache.
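In rough terms, the change looks like the sketch below (the context names and the lazy-creation shape are just an outline of the idea; please see the attached patch for the real thing):

    /* Sketch only, not the attached patch itself.
     * Assumes utils/memutils.h and utils/catcache.h are included. */
    static MemoryContext CatCacheMemoryContext = NULL;

    static void
    CreateCatCacheMemoryContext(void)
    {
        /* make sure the parent exists first, as the existing code does */
        if (!CacheMemoryContext)
            CreateCacheMemoryContext();

        if (!CatCacheMemoryContext)
            CatCacheMemoryContext =
                AllocSetContextCreate(CacheMemoryContext,
                                      "CatCacheMemoryContext",
                                      ALLOCSET_DEFAULT_SIZES);
    }

Catcache allocations then switch to CatCacheMemoryContext instead of CacheMemoryContext, and relcache.c gets a RelCacheMemoryContext the same way, so each cache's usage shows up under its own context in pg_backend_memory_contexts.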

To quickly show what pg_backend_memory_contexts would look like, I did the following:

-Create some tables:
SELECT 'BEGIN;' UNION ALL SELECT format('CREATE TABLE %1$s(id serial primary key, data text not null unique)', 'test_'||g.i) FROM generate_series(0, 1000) g(i) UNION ALL SELECT 'COMMIT;';\gexec

-Open a new connection and query pg_backend_memory_contexts [1]:
This is what you'll see before and after the patch.
-- HEAD:
        name        | used_bytes | free_bytes | total_bytes
--------------------+------------+------------+-------------
 CacheMemoryContext |     467656 |      56632 |      524288
 index info         |     111760 |      46960 |      158720
 relation rules     |       4416 |       3776 |        8192
(3 rows)

-- Patch:
         name          | used_bytes | free_bytes | total_bytes
-----------------------+------------+------------+-------------
 CatCacheMemoryContext |     217696 |      44448 |      262144
 RelCacheMemoryContext |     248264 |      13880 |      262144
 index info            |     111760 |      46960 |      158720
 CacheMemoryContext    |       2336 |       5856 |        8192
 relation rules        |       4416 |       3776 |        8192
(5 rows)


- Run select on all tables
SELECT format('SELECT count(*) FROM %1$s', 'test_'||g.i) FROM generate_series(0, 1000) g(i);\gexec

- Then check pg_backend_memory_contexts [1] again: 
--HEAD
        name        | used_bytes | free_bytes | total_bytes
--------------------+------------+------------+-------------
 CacheMemoryContext |    8197344 |     257056 |     8454400
 index info         |    2102160 |     113776 |     2215936
 relation rules     |       4416 |       3776 |        8192
(3 rows)

--Patch
         name          | used_bytes | free_bytes | total_bytes
-----------------------+------------+------------+-------------
 RelCacheMemoryContext |    4706464 |    3682144 |     8388608
 CatCacheMemoryContext |    3489384 |     770712 |     4260096
 index info            |    2102160 |     113776 |     2215936
 CacheMemoryContext    |       2336 |       5856 |        8192
 relation rules        |       4416 |       3776 |        8192
(5 rows)


You can see that CacheMemoryContext does not use much memory once catcache and relcache are excluded (at least in cases similar to the above), and that it's easy to bloat catcache and relcache. That's why I think it would be useful to see their usage separately.

Any feedback would be appreciated.

[1] 
SELECT
name,
sum(used_bytes) AS used_bytes,
sum(free_bytes) AS free_bytes,
sum(total_bytes) AS total_bytes
FROM pg_backend_memory_contexts
WHERE name LIKE '%CacheMemoryContext%' OR parent LIKE '%CacheMemoryContext%'
GROUP BY name
ORDER BY total_bytes DESC;


Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Separate memory contexts for relcache and catcache

From
Andy Fan
Date:


Most catcache and relcache entries (other than index info etc.) currently go straight into CacheMemoryContext. And I believe these two caches can be the ones with the largest contribution to the memory usage of CacheMemoryContext most of the time. For example, in cases where we have lots of database objects accessed in a long-lived connection, CacheMemoryContext tends to increase significantly.

While I've been working on another patch for pg_backend_memory_contexts view, we thought that it would also be better to see the memory usages of different kinds of caches broken down into their own contexts. The attached patch implements this and aims to easily keep track of the memory used by relcache and catcache


+1 for the idea. This would be pretty useful as proof of which
context is consuming most of the memory, and it doesn't cost
much.  It would be handier than estimating that with something
like select count(*) from pg_class.

I think, for example, if we find relcache using too much memory,
it is a signal that the user may be using too many partitioned tables.


--
Best Regards
Andy Fan

Re: Separate memory contexts for relcache and catcache

From
Alvaro Herrera
Date:
On 2023-Aug-09, Melih Mutlu wrote:

> --Patch
>          name          | used_bytes | free_bytes | total_bytes
> -----------------------+------------+------------+-------------
>  RelCacheMemoryContext |    4706464 |    3682144 |     8388608
>  CatCacheMemoryContext |    3489384 |     770712 |     4260096
>  index info            |    2102160 |     113776 |     2215936
>  CacheMemoryContext    |       2336 |       5856 |        8192
>  relation rules        |       4416 |       3776 |        8192
> (5 rows)

Hmm, is this saying that there's too much fragmentation in the relcache
context?  Maybe it would improve things to make it a SlabContext instead
of AllocSet.  Or, more precisely, a bunch of SlabContexts, each with the
appropriate chunkSize for the object being stored.  (I don't say this
because I know for a fact that Slab is better for these purposes; it's
just that I happened to read its comments yesterday and they stated that
it behaves better in terms of fragmentation.  Maybe Andres or Tomas have
an opinion on this.)

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"I love the Postgres community. It's all about doing things _properly_. :-)"
(David Garamond)



Re: Separate memory contexts for relcache and catcache

From
David Rowley
Date:
On Thu, 10 Aug 2023 at 01:23, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Aug-09, Melih Mutlu wrote:
>
> > --Patch
> >          name          | used_bytes | free_bytes | total_bytes
> > -----------------------+------------+------------+-------------
> >  RelCacheMemoryContext |    4706464 |    3682144 |     8388608
> >  CatCacheMemoryContext |    3489384 |     770712 |     4260096
> >  index info            |    2102160 |     113776 |     2215936
> >  CacheMemoryContext    |       2336 |       5856 |        8192
> >  relation rules        |       4416 |       3776 |        8192
> > (5 rows)
>
> Hmm, is this saying that there's too much fragmentation in the relcache
> context?

free_bytes is just the space in the blocks that are not being used by
any allocated chunks or chunks on the freelist.

It looks like RelCacheMemoryContext has 10 blocks including the 8kb
initial block:

postgres=# select 8192 + sum(8192*power(2,x)) as total_bytes from
generate_series(0,9) x;
 total_bytes
-------------
     8388608

The first 2 blocks are 8KB as we only start doubling after we malloc
the first 8kb block after the keeper block.

If there were 1 fewer block then total_bytes would be 4194304, which is
less than the used_bytes for that context, so those 10 blocks look
needed.

> Maybe it would improve things to make it a SlabContext instead
> of AllocSet.  Or, more precisely, a bunch of SlabContexts, each with the
> appropriate chunkSize for the object being stored.

It would at least save having to do the power-of-2 rounding that
aset does. However, on a quick glance, it seems not all the size
requests in relcache.c are fixed.  I see a datumCopy() in
RelationBuildTupleDesc() for the attmissingval stuff, so we couldn't
SlabAlloc that.

It could be worth looking at the size classes of the fixed-sized
allocations to estimate how much memory we might save by using slab to
avoid the power-2 rounding that aset.c does. However, if there are too
many contexts then we may end up using more memory with all the
mostly-empty contexts for backends that only query a tiny number of
tables.  That might not be good.  Slab also does not do block doubling
like aset does, so it might be hard to choose a good block size.
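
Just to sketch Alvaro's idea for one of the fixed-size classes (hypothetical; whether this actually wins is exactly what would need measuring):

    /* hypothetical: one slab per fixed-size allocation class */
    MemoryContext relSlab =
        SlabContextCreate(CacheMemoryContext,
                          "relcache RelationData slab",
                          SLAB_DEFAULT_BLOCK_SIZE,    /* slab does no block doubling */
                          sizeof(RelationData));

    Relation rel = (Relation) MemoryContextAlloc(relSlab, sizeof(RelationData));

but variable-sized requests such as the attmissingval copy mentioned above would still have to go to an aset somewhere.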

> (I don't say this
> because I know for a fact that Slab is better for these purposes; it's
> just that I happened to read its comments yesterday and they stated that
> it behaves better in terms of fragmentation.  Maybe Andres or Tomas have
> an opinion on this.)

I'm not sure of the exact comment, but I was in there recently and
there's a chance that I wrote that comment.  Slab prioritises putting
new chunks on fuller blocks and may free() blocks once they become
empty of any chunks.  Aset does no free()ing of blocks unless a block
was malloc()ed especially for a chunk above allocChunkLimit.  That
means aset might hold a lot of malloc'ed memory for chunks that just
sit on freelists and might never be used again; meanwhile, other
request sizes may have to malloc new blocks.

David



Re: Separate memory contexts for relcache and catcache

From
Andres Freund
Date:
Hi,

On 2023-08-09 15:02:31 +0300, Melih Mutlu wrote:
> To quickly show how pg_backend_memory_contexts would look like, I did the
> following:
> 
> -Create some tables:
> SELECT 'BEGIN;' UNION ALL SELECT format('CREATE TABLE %1$s(id serial
> primary key, data text not null unique)', 'test_'||g.i) FROM
> generate_series(0, 1000) g(i) UNION ALL SELECT 'COMMIT;';\gexec
> 
> -Open a new connection and query pg_backend_memory_contexts [1]:
> This is what you'll see before and after the patch.
> -- HEAD:
>         name        | used_bytes | free_bytes | total_bytes
> --------------------+------------+------------+-------------
>  CacheMemoryContext |     467656 |      56632 |      524288
>  index info         |     111760 |      46960 |      158720
>  relation rules     |       4416 |       3776 |        8192
> (3 rows)
> 
> -- Patch:
>          name          | used_bytes | free_bytes | total_bytes
> -----------------------+------------+------------+-------------
>  CatCacheMemoryContext |     217696 |      44448 |      262144
>  RelCacheMemoryContext |     248264 |      13880 |      262144
>  index info            |     111760 |      46960 |      158720
>  CacheMemoryContext    |       2336 |       5856 |        8192
>  relation rules        |       4416 |       3776 |        8192
> (5 rows)

Have you checked what the source of the remaining allocations in
CacheMemoryContext are?


One thing that I had observed previously and reproduced with this patch, is
that the first backend starting after a restart uses considerably more memory:

first:
┌───────────────────────┬────────────┬────────────┬─────────────┐
│         name          │ used_bytes │ free_bytes │ total_bytes │
├───────────────────────┼────────────┼────────────┼─────────────┤
│ CatCacheMemoryContext │     370112 │     154176 │      524288 │
│ RelCacheMemoryContext │     244136 │      18008 │      262144 │
│ index info            │     104392 │      45112 │      149504 │
│ CacheMemoryContext    │       2304 │       5888 │        8192 │
│ relation rules        │       3856 │        240 │        4096 │
└───────────────────────┴────────────┴────────────┴─────────────┘

second:
┌───────────────────────┬────────────┬────────────┬─────────────┐
│         name          │ used_bytes │ free_bytes │ total_bytes │
├───────────────────────┼────────────┼────────────┼─────────────┤
│ CatCacheMemoryContext │     215072 │      47072 │      262144 │
│ RelCacheMemoryContext │     243856 │      18288 │      262144 │
│ index info            │     104944 │      47632 │      152576 │
│ CacheMemoryContext    │       2304 │       5888 │        8192 │
│ relation rules        │       3856 │        240 │        4096 │
└───────────────────────┴────────────┴────────────┴─────────────┘

This isn't caused by this patch, but it does make it easier to pinpoint than
before.  The reason is fairly simple: on the first start we can't use the
relcache init files, on later starts we can. The reason the size increase is
in CatCacheMemoryContext, rather than RelCacheMemoryContext, is simple: when
using the init file the catcache isn't used; when not, we have to query the
catcache a lot to build the initial relcache contents.


Given the size of both CatCacheMemoryContext and RelCacheMemoryContext in a
new backend, I think it might be worth using non-default aset parameters. It's
a bit ridiculous to increase block sizes from 8k upwards in every single
connection ever made to postgres.
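
E.g. something like this (numbers purely illustrative, not a proposal):

    RelCacheMemoryContext =
        AllocSetContextCreate(CacheMemoryContext,
                              "RelCacheMemoryContext",
                              0,                   /* minContextSize */
                              256 * 1024,          /* initBlockSize, rather than 8kB */
                              8 * 1024 * 1024);    /* maxBlockSize */

so a fresh backend doesn't have to walk through several block-size doublings just to load the initial cache contents.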


> - Run select on all tables
> SELECT format('SELECT count(*) FROM %1$s', 'test_'||g.i) FROM
> generate_series(0, 1000) g(i);\gexec
> 
> - Then check pg_backend_memory_contexts [1] again:
> --HEAD
>         name        | used_bytes | free_bytes | total_bytes
> --------------------+------------+------------+-------------
>  CacheMemoryContext |    8197344 |     257056 |     8454400
>  index info         |    2102160 |     113776 |     2215936
>  relation rules     |       4416 |       3776 |        8192
> (3 rows)
> 
> --Patch
>          name          | used_bytes | free_bytes | total_bytes
> -----------------------+------------+------------+-------------
>  RelCacheMemoryContext |    4706464 |    3682144 |     8388608
>  CatCacheMemoryContext |    3489384 |     770712 |     4260096
>  index info            |    2102160 |     113776 |     2215936
>  CacheMemoryContext    |       2336 |       5856 |        8192
>  relation rules        |       4416 |       3776 |        8192
> (5 rows)
> 
> You can see that CacheMemoryContext does not use much memory without
> catcache and relcache (at least in cases similar to above), and it's easy
> to bloat catcache and relcache. That's why I think it would be useful to
> see their usage separately.

Yes, I think it'd be quite useful. There's ways to bloat particularly catcache
much further, and it's hard to differentiate that from other sources of bloat
right now.


> +static void
> +CreateCatCacheMemoryContext()

We typically use (void) to differentiate from an older way of function
declarations that didn't have argument types.
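
I.e. (just illustrating the C rule):

    static void CreateCatCacheMemoryContext(void);  /* takes no arguments */
    static void CreateCatCacheMemoryContext();      /* old style: parameter types unspecified */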


> +{
> +    if (!CacheMemoryContext)
> +        CreateCacheMemoryContext();

I wish we just made sure that cache memory context were created in the right
place, instead of spreading this check everywhere...


> @@ -3995,9 +3998,9 @@ RelationCacheInitializePhase2(void)
>          return;
>  
>      /*
> -     * switch to cache memory context
> +     * switch to relcache memory context
>       */
> -    oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
> +    oldcxt = MemoryContextSwitchTo(RelCacheMemoryContext);
>  
>      /*
>       * Try to load the shared relcache cache file.  If unsuccessful, bootstrap
> @@ -4050,9 +4053,9 @@ RelationCacheInitializePhase3(void)
>      RelationMapInitializePhase3();
>  
>      /*
> -     * switch to cache memory context
> +     * switch to relcache memory context
>       */
> -    oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
> +    oldcxt = MemoryContextSwitchTo(RelCacheMemoryContext);
>  
>      /*
>       * Try to load the local relcache cache file.  If unsuccessful, bootstrap

I'd just delete these comments, they're just pointlessly restating the code.


Greetings,

Andres Freund



Re: Separate memory contexts for relcache and catcache

From
torikoshia
Date:
Hi,

I also think this change would be helpful.

I imagine you're working on Andres's comments and have already noticed
this, but the v1 patch cannot be applied to HEAD.
For the convenience of other reviewers, I marked it 'Waiting on Author'.

-- 
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation



Re: Separate memory contexts for relcache and catcache

From
Melih Mutlu
Date:
Hi,

torikoshia <torikoshia@oss.nttdata.com> wrote on Mon, 4 Dec 2023 at 07:59:
Hi,

I also think this change would be helpful.

I imagine you're working on the Andres's comments and you already notice
this, but v1 patch cannot be applied to HEAD.
For the convenience of other reviewers, I marked it 'Waiting on Author'.

Thanks for letting me know. I rebased the patch. PFA new version.




Andres Freund <andres@anarazel.de> wrote on Thu, 12 Oct 2023 at 20:01:
Hi,

Have you checked what the source of the remaining allocations in
CacheMemoryContext are?

It's mostly typecache, around 2K. Do you think typecache also needs a separate context?

Given the size of both CatCacheMemoryContext and RelCacheMemoryContext in a
new backend, I think it might be worth using non-default aset parameters. A
bit ridiculous to increase block sizes from 8k upwards in every single
connection made to postgres ever.

Considering it starts from ~262K, what would be a better init size? 256K?

> +static void
> +CreateCatCacheMemoryContext()

We typically use (void) to differentiate from an older way of function
declarations that didn't have argument types.
Done. 

> +{
> +     if (!CacheMemoryContext)
> +             CreateCacheMemoryContext();

I wish we just made sure that cache memory context were created in the right
place, instead of spreading this check everywhere...

That would be nice. Do you have a suggestion about where that right place would be?

I'd just delete these comments, they're just pointlessly restating the code.

Done.

Thanks,
--
Melih Mutlu
Microsoft
Attachment

Re: Separate memory contexts for relcache and catcache

From
vignesh C
Date:
On Wed, 3 Jan 2024 at 16:56, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> torikoshia <torikoshia@oss.nttdata.com>, 4 Ara 2023 Pzt, 07:59 tarihinde şunu yazdı:
>>
>> Hi,
>>
>> I also think this change would be helpful.
>>
>> I imagine you're working on the Andres's comments and you already notice
>> this, but v1 patch cannot be applied to HEAD.
>> For the convenience of other reviewers, I marked it 'Waiting on Author'.
>
>
> Thanks for letting me know. I rebased the patch. PFA new version.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
729439607ad210dbb446e31754e8627d7e3f7dda ===
=== applying patch
./v2-0001-Separate-memory-contexts-for-relcache-and-catcach.patch
patching file src/backend/utils/cache/catcache.c
...
Hunk #8 FAILED at 1933.
Hunk #9 succeeded at 2253 (offset 84 lines).
1 out of 9 hunks FAILED -- saving rejects to file
src/backend/utils/cache/catcache.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4554.log

Regards,
Vignesh



Re: Separate memory contexts for relcache and catcache

From
Melih Mutlu
Date:


vignesh C <vignesh21@gmail.com> wrote on Sat, 27 Jan 2024 at 06:01:
On Wed, 3 Jan 2024 at 16:56, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
729439607ad210dbb446e31754e8627d7e3f7dda ===
=== applying patch
./v2-0001-Separate-memory-contexts-for-relcache-and-catcach.patch
patching file src/backend/utils/cache/catcache.c
...
Hunk #8 FAILED at 1933.
Hunk #9 succeeded at 2253 (offset 84 lines).
1 out of 9 hunks FAILED -- saving rejects to file
src/backend/utils/cache/catcache.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4554.log

Regards,
Vignesh

Rebased. PSA.


--
Melih Mutlu
Microsoft
Attachment

Re: Separate memory contexts for relcache and catcache

From
Jeff Davis
Date:
On Wed, 2024-04-03 at 16:12 +0300, Melih Mutlu wrote:
> Rebased. PSA.

Thank you. I missed your patch and came up with a similar patch over
here:

https://www.postgresql.org/message-id/flat/78599c442380ddb5990117e281a4fa65a74231af.camel@j-davis.com

I closed my thread and we can continue this one.

One difference is that I tried to capture almost all uses of
CacheMemoryContext so that it would become just a parent context
without many allocations of its own.

The plan cache and SPI caches can be important, too. Or, one of the
other caches that we expect to be small might grow in some edge cases
(or due to a bug), and it would be good to be able to see that.

I agree with others that we should look at changing the initial size or
type of the contexts, but that should be a separate commit.

Regards,
    Jeff Davis




Re: Separate memory contexts for relcache and catcache

From
Andres Freund
Date:
Hi,

On 2024-10-29 15:00:02 -0700, Jeff Davis wrote:
> On Wed, 2024-04-03 at 16:12 +0300, Melih Mutlu wrote:
> > Rebased. PSA.
>
> Thank you. I missed your patch and came up with a similar patch over
> here:
>
> https://www.postgresql.org/message-id/flat/78599c442380ddb5990117e281a4fa65a74231af.camel@j-davis.com
>
> I closed my thread and we can continue this one.
>
> One difference is that I tried to capture almost all uses of
> CacheMemoryContext so that it would become just a parent context
> without many allocations of its own.

I'm a bit worried about the increase in "wasted" memory we might end up with
when creating one aset for *everything*. Just splitting out Relcache and
CatCache isn't a big deal from that angle, they're always used reasonably
much. But creating a bunch of barely used contexts does have the potential for
lots of memory being wasted at the end of a page and on freelists.  It might
be ok as far as what you proposed in the above email goes, I haven't analyzed
that in depth yet.

> I agree with others that we should look at changing the initial size or
> type of the contexts, but that should be a separate commit.

It needs to be done close together though, otherwise we'll increase the
new-connection-memory-usage of postgres measurably.


I've previously proposed creating a type of memory context that's intended for
places where we never expect to allocate much, which allocates from either a
superior memory context or just from the system allocator and tracks memory
via linked lists.  That'd allow us to use fairly granular memory contexts with
low overhead, which we could use, e.g., to actually create each catcache &
relcache entry in its own context.

One concern that was voiced about that idea was that it'd perform badly if
such a context did end up being used hotly - I'm not sure that's a real
problem, but we could address it by switching to a different allocation scheme
once a certain size is reached.

Greetings,

Andres Freund



Re: Separate memory contexts for relcache and catcache

From
Jeff Davis
Date:
On Fri, 2024-11-01 at 15:19 -0400, Andres Freund wrote:
> I'm a bit worried about the increase in "wasted" memory we might end
> up when
> creating one aset for *everything*. Just splitting out Relcache and
> CatCache
> isn't a big deal from that angle, they're always used reasonably
> much. But
> creating a bunch of barely used contexts does have the potential for
> lots of
> memory being wasted at the end of a page and on freelists.  It might
> be ok as
> far was what you proposed in the above email, I haven't analyzed that
> in depth
> yet.

Melih raised similar concerns. The new contexts that my patch created
were CatCacheContext, RelCacheContext, SPICacheContext,
PgOutputContext, PlanCacheContext, TextSearchCacheContext, and
TypCacheContext.

Those are all created lazily, so you need to at least be using the
relevant feature before it has any cost (with the exception of the
first two).

> > I agree with others that we should look at changing the initial
> > size or
> > type of the contexts, but that should be a separate commit.
>
> It needs to be done close together though, otherwise we'll increase
> the
> new-connection-memory-usage of postgres measurably.

I don't have a strong opinion here; that was a passing comment. But I'm
curious: why would it increase the per-connection memory usage much to
just have a couple of new memory contexts?

> I've previously proposed creating a type of memory context that's
> intended for
> places where we never expect to allocate much which allocates from
> either a
> superior memory context or just from the system allocator and tracks
> memory
> via linked lists.

Why not just use ALLOCSET_SMALL_SIZES?


Regards,
    Jeff Davis




Re: Separate memory contexts for relcache and catcache

From
Andres Freund
Date:
Hi,

On 2024-11-01 14:47:37 -0700, Jeff Davis wrote:
> On Fri, 2024-11-01 at 15:19 -0400, Andres Freund wrote:
> > I'm a bit worried about the increase in "wasted" memory we might end
> > up when
> > creating one aset for *everything*. Just splitting out Relcache and
> > CatCache
> > isn't a big deal from that angle, they're always used reasonably
> > much. But
> > creating a bunch of barely used contexts does have the potential for
> > lots of
> > memory being wasted at the end of a page and on freelists.  It might
> > be ok as
> > far was what you proposed in the above email, I haven't analyzed that
> > in depth
> > yet.
>
> Melih raised similar concerns. The new contexts that my patch created
> were CatCacheContext, RelCacheContext, SPICacheContext,
> PgOutputContext, PlanCacheContext, TextSearchCacheContext, and
> TypCacheContext.
>
> Those are all created lazily, so you need to at least be using the
> relevant feature before it has any cost (with the exception of the
> first two).

Well, you can't get very far without using at least CatCacheContext,
RelCacheContext, PlanCacheContext, TypCacheContext. The others are indeed much
more specific and not really worth worrying about.


> > > I agree with others that we should look at changing the initial
> > > size or
> > > type of the contexts, but that should be a separate commit.
> >
> > It needs to be done close together though, otherwise we'll increase
> > the
> > new-connection-memory-usage of postgres measurably.
>
> I don't have a strong opinion here; that was a passing comment. But I'm
> curious: why it would increase the per-connection memory usage much to
> just have a couple new memory contexts?

"much" is maybe too strong. But the memory usage in a new connection is fairly
low, it doesn't take a large increase to be noticeable percentage-wise. And
given how much people love having poolers full of idle connections, it shows
up in aggregate.


> > I've previously proposed creating a type of memory context that's
> > intended for
> > places where we never expect to allocate much which allocates from
> > either a
> > superior memory context or just from the system allocator and tracks
> > memory
> > via linked lists.
>
> Why not just use ALLOCSET_SMALL_SIZES?

That helps some, but not *that* much. You still end up with a bunch of partially
filled blocks. Here's e.g. an excerpt with your patch applied:

│             name             │               ident               │   type   │ level │     path     │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├──────────────────────────────┼───────────────────────────────────┼──────────┼───────┼──────────────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│ CacheMemoryContext           │ (null)                            │ AllocSet │     2 │ {1,19}       │        8192 │             1 │       7952 │           0 │        240 │
│ TypCacheContext              │ (null)                            │ AllocSet │     3 │ {1,19,28}    │        8192 │             1 │       4816 │           0 │       3376 │
│ search_path processing cache │ (null)                            │ AllocSet │     3 │ {1,19,29}    │        8192 │             1 │       5280 │           7 │       2912 │
│ CatCacheContext              │ (null)                            │ AllocSet │     3 │ {1,19,30}    │      262144 │             6 │      14808 │           0 │     247336 │
│ RelCacheContext              │ (null)                            │ AllocSet │     3 │ {1,19,31}    │      262144 │             6 │       8392 │           2 │     253752 │
│ relation rules               │ pg_backend_memory_contexts        │ AllocSet │     4 │ {1,19,31,34} │        8192 │             4 │       3280 │           1 │       4912 │
│ index info                   │ manyrows_pkey                     │ AllocSet │     4 │ {1,19,31,35} │        2048 │             2 │        864 │           1 │       1184 │
│ index info                   │ pg_statistic_ext_relid_index      │ AllocSet │     4 │ {1,19,31,36} │        2048 │             2 │        928 │           1 │       1120 │
│ index info                   │ pg_class_tblspc_relfilenode_index │ AllocSet │     4 │ {1,19,31,37} │        2048 │             2 │        440 │           1 │       1608 │

(this is a tiny bit misleading as "search_path processing cache" was just moved)

You can quickly see that the various contexts have a decent amount of free
space, in some cases a large fraction of their space.

We've already been more aggressive about using separate contexts for indexes -
and in aggregate that memory usage shows up:

postgres[1088243][1]=# SELECT count(*), sum(total_bytes) as total_bytes, sum(total_nblocks) as total_nblocks,
    sum(free_bytes) free_bytes, sum(free_chunks) as free_chunks, sum(used_bytes) used_bytes
  FROM pg_backend_memory_contexts
  WHERE path @> (SELECT path FROM pg_backend_memory_contexts WHERE name = 'CacheMemoryContext') AND name = 'index info';
┌───────┬─────────────┬───────────────┬────────────┬─────────────┬────────────┐
│ count │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├───────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│    87 │      162816 │           144 │      48736 │         120 │     114080 │
└───────┴─────────────┴───────────────┴────────────┴─────────────┴────────────┘



And it's not just the partially filled blocks that are an "issue", it's also
the freelists that are much less likely to be used soon if they're split very
granularly. Often we'll end up with memory in freelists that are created while
building some information that then will not be used again.


Without your patch:

┌────────────────────┬─────────────────────────────┬──────────┬───────┬───────────┬─────────────┬───────────────┬────────────┬─────────────┬────────────┐
│        name        │            ident            │   type   │ level │   path    │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├────────────────────┼─────────────────────────────┼──────────┼───────┼───────────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│ CacheMemoryContext │ (null)                      │ AllocSet │     2 │ {1,17}    │      524288 │             7 │      75448 │           0 │     448840 │
│ relation rules     │ pg_backend_memory_contexts  │ AllocSet │     3 │ {1,17,27} │        8192 │             4 │       3472 │           4 │       4720 │
...


Greetings,

Andres Freund



Re: Separate memory contexts for relcache and catcache

From
Ashutosh Bapat
Date:
On Sat, Nov 2, 2024 at 3:17 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Fri, 2024-11-01 at 15:19 -0400, Andres Freund wrote:
> > I'm a bit worried about the increase in "wasted" memory we might end
> > up when
> > creating one aset for *everything*. Just splitting out Relcache and
> > CatCache
> > isn't a big deal from that angle, they're always used reasonably
> > much. But
> > creating a bunch of barely used contexts does have the potential for
> > lots of
> > memory being wasted at the end of a page and on freelists.  It might
> > be ok as
> > far was what you proposed in the above email, I haven't analyzed that
> > in depth
> > yet.
>
> Melih raised similar concerns. The new contexts that my patch created
> were CatCacheContext, RelCacheContext, SPICacheContext,
> PgOutputContext, PlanCacheContext, TextSearchCacheContext, and
> TypCacheContext.
>
> Those are all created lazily, so you need to at least be using the
> relevant feature before it has any cost (with the exception of the
> first two).
>
> > > I agree with others that we should look at changing the initial
> > > size or
> > > type of the contexts, but that should be a separate commit.
> >
> > It needs to be done close together though, otherwise we'll increase
> > the
> > new-connection-memory-usage of postgres measurably.
>
> I don't have a strong opinion here; that was a passing comment. But I'm
> curious: why it would increase the per-connection memory usage much to
> just have a couple new memory contexts?


Without patch
First backend
SELECT count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes))
free_bytes, sum(free_chunks) as free_chunks,
pg_size_pretty(sum(used_bytes)) used_bytes from
pg_get_backend_memory_contexts();
 count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
-------+-------------+---------------+------------+-------------+------------
   121 | 1917 kB     |           208 | 716 kB     |         128 | 1201 kB
(1 row)

Second backend
SELECT count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes))
free_bytes, sum(free_chunks) as free_chunks,
pg_size_pretty(sum(used_bytes)) used_bytes from
pg_get_backend_memory_contexts();
 count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
-------+-------------+---------------+------------+-------------+------------
   121 | 1408 kB     |           210 | 384 kB     |         186 | 1024 kB
(1 row)

With both patches from Melih applied
First backend
SELECT count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes))
free_bytes, sum(free_chunks) as free_chunks,
pg_size_pretty(sum(used_bytes)) used_bytes from
pg_get_backend_memory_contexts();
 count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
-------+-------------+---------------+------------+-------------+------------
   124 | 1670 kB     |           207 | 467 kB     |         128 | 1203 kB
(1 row)

Second backend
SELECT count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes))
free_bytes, sum(free_chunks) as free_chunks,
pg_size_pretty(sum(used_bytes)) used_bytes from
pg_get_backend_memory_contexts();
 count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
-------+-------------+---------------+------------+-------------+------------
   124 | 1417 kB     |           209 | 391 kB     |         187 | 1026 kB
(1 row)

So it looks like the patches do reduce memory allocated at the start
of a backend. That is better as far as the conditions just after the
backend start are concerned.

The chunks of memory allocated in a given context will more likely
have similar sizes, since they are allocated for the same types of
objects, as compared to one big context where chunks are allocated for
many different kinds of objects. I believe this will lead to better
utilization of the freelists.

--
Best Wishes,
Ashutosh Bapat



Re: Separate memory contexts for relcache and catcache

From
Ashutosh Bapat
Date:
On Sat, Nov 2, 2024 at 4:18 AM Andres Freund <andres@anarazel.de> wrote:
>
>
> > > I've previously proposed creating a type of memory context that's
> > > intended for
> > > places where we never expect to allocate much which allocates from
> > > either a
> > > superior memory context or just from the system allocator and tracks
> > > memory
> > > via linked lists.
> >
> > Why not just use ALLOCSET_SMALL_SIZES?
>
> That helps some, but not *that* much. You still end up with a bunch of partially
> filled blocks. Here's e.g. an excerpt with your patch applied:
>
> ...
>
> (this is a tiny bit misleading as "search_path processing cache" was just moved)
>
> You can quickly see that the various contexts have a decent amount of free
> space, in some cases a large fraction of their space.
>
> We've already been more aggressive about using separate contexts for indexes -
> and in aggregate that memory usage shows up:
>
> ...
>
> And it's not just the partially filled blocks that are an "issue", it's also
> the freelists that are much less likely to be used soon if they're split very
> granularly. Often we'll end up with memory in freelists that are created while
> building some information that then will not be used again.
>
> Without your patch:
>
> ...

If these caches are not used at all, this might be a problem. But I
think the applications which use TextSearchCacheContext, let's say,
are likely to use it so frequently that the free chunks will be
recycled. So, I don't know whether that will be a huge problem with
partial blocks and freelists.

However, we agree that it's generally good to have (at least some)
specific contexts as children of CacheMemoryContext. It will be good
to move ahead with the ones we all agree on for now. Looking at all the
emails, those will be CatCacheContext, RelCacheContext,
PlanCacheContext and TypCacheContext. If we go with fewer contexts, it
will be good not to lose the work Jeff did for the other contexts,
though. I like those Create*CacheContext() functions. They identify
various specific uses of CacheMemoryContext. In the future, if we think
that we need specific contexts for some of those, these will be the
functions where we create them. We might need to rename those functions
to Get*CacheContext() instead of Create*, since they won't create a
context right away; a sketch of that shape is below.
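
Something like this hypothetical shape (not code from either patch):

    /* callers always go through the getter, so a dedicated context can be
     * introduced later without touching any call sites */
    static MemoryContext PlanCacheContext = NULL;

    MemoryContext
    GetPlanCacheContext(void)
    {
        if (PlanCacheContext == NULL)
        {
            if (!CacheMemoryContext)
                CreateCacheMemoryContext();
            /* for now, just hand back CacheMemoryContext itself */
            PlanCacheContext = CacheMemoryContext;
        }
        return PlanCacheContext;
    }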

-- 
Best Wishes,
Ashutosh Bapat

Re: Separate memory contexts for relcache and catcache

From
Jeff Davis
Date:
On Mon, 2024-11-11 at 17:05 +0530, Ashutosh Bapat wrote:
> It will be good
> to move ahead with the ones we all agree for now. Looking at all the
> emails, those will be CatCacheContext,
> RelCacheContext, PlanCacheContext, TypCacheContext.

I'm not sure we have consensus on all of those yet. Andres's concern,
IIUC, is that the additional memory contexts will cause additional
fragmentation.

I believe we have a rough consensus that CatCacheContext and
RelCacheContext are wanted, but we're trying to find ways to mitigate
the fragmentation.

Regards,
    Jeff Davis




Re: Separate memory contexts for relcache and catcache

From
Ashutosh Bapat
Date:
On Tue, Nov 12, 2024 at 2:57 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Mon, 2024-11-11 at 17:05 +0530, Ashutosh Bapat wrote:
> > It will be good
> > to move ahead with the ones we all agree for now. Looking at all the
> > emails, those will be CatCacheContext,
> > RelCacheContext, PlanCacheContext, TypCacheContext.
>
> I'm not sure we have consensus on all of those yet. Andres's concern,
> IIUC, is that the additional memory contexts will cause additional
> fragmentation.
>
> I believe we have a rough consensus that CatCacheContext and
> RelCacheContext are wanted, but we're trying to find ways to mitigate
> the fragmentation.

The totals (free_bytes, total_bytes, used_bytes) of memory contexts
separated from CacheMemoryContext versus those without the separation
are (35968, 540672, 504704) vs (75448, 524288, 448840). There's an
increase of about 20K in used_bytes and total_bytes. And we guess/know
that the increase is because of fragmentation. Am I right? But I can't
find any reference to what load Andres ran that resulted in this state
[1], so I can't judge whether that increase represents a typical case
or not.

I experimented with the plan cache context. I created 1000 tables
using Melih's queries [2], but made them partitions of a single
partitioned table.
With no prepared statement
#SELECT name, count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
  sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes)) free_bytes,
  sum(free_chunks) as free_chunks, pg_size_pretty(sum(used_bytes)) used_bytes
  from pg_get_backend_memory_contexts()
  where name like 'CachedPlan%' or name = 'PlanCacheContext' group by name;
       name       | count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
------------------+-------+-------------+---------------+------------+-------------+------------
 PlanCacheContext |     1 | 8192 bytes  |             1 | 7952 bytes |           0 | 240 bytes
(1 row)

With 10 prepared statement each selecting from the partitioned table
#SELECT format('prepare all_tables_%s as SELECT count(*) FROM test',
g.i) from generate_series(1, 10) g(i); \gexec

#SELECT name, count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
  sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes)) free_bytes,
  sum(free_chunks) as free_chunks, pg_size_pretty(sum(used_bytes)) used_bytes
  from pg_get_backend_memory_contexts()
  where name like 'CachedPlan%' or name = 'PlanCacheContext' group by name;
       name       | count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
------------------+-------+-------------+---------------+------------+-------------+------------
 CachedPlanQuery  |    10 | 40 kB       |            30 | 17 kB      |           0 | 23 kB
 CachedPlanSource |    10 | 20 kB       |            20 | 3920 bytes |           0 | 16 kB
 PlanCacheContext |     1 | 8192 bytes  |             1 | 7952 bytes |           0 | 240 bytes
(3 rows)

After executing all those 10 statements
#SELECT format('execute all_tables_%s', g.i) from generate_series(1,
10) g(i); \gexec

#SELECT name, count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
  sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes)) free_bytes,
  sum(free_chunks) as free_chunks, pg_size_pretty(sum(used_bytes)) used_bytes
  from pg_get_backend_memory_contexts()
  where name like 'CachedPlan%' or name = 'PlanCacheContext' group by name;
       name       | count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
------------------+-------+-------------+---------------+------------+-------------+------------
 CachedPlan       |    10 | 20 MB       |           124 | 9388 kB    |          28 | 11 MB
 CachedPlanQuery  |    10 | 40 kB       |            30 | 17 kB      |           0 | 23 kB
 CachedPlanSource |    10 | 20 kB       |            20 | 3920 bytes |           0 | 16 kB
 PlanCacheContext |     1 | 8192 bytes  |             1 | 7952 bytes |           0 | 240 bytes
(4 rows)

PlanCacheContext is never used for actual planned statements. In fact
I am not sure whether those 8K bytes it's consuming are real or just
context overhead. The real memory is used from CachedPlan* contexts
which are created and destroyed for each prepared statement.

The only use of the shell context is to be able to query memory
context statistics of cached plans, in case we change the names of
contexts for individual planned queries in the future:
SELECT name, count(*), pg_size_pretty(sum(total_bytes)) as
total_bytes, sum(total_nblocks) as total_nblocks,
pg_size_pretty(sum(free_bytes)) free_bytes, sum(free_chunks) as
free_chunks, pg_size_pretty(sum(used_bytes)) used_bytes from
pg_get_backend_memory_contexts() where path @> (select path from
pg_get_backend_memory_contexts() where name = 'PlanCacheContext')
group by name;

So separating PlanCacheContext seems to have little use.

[1] https://www.postgresql.org/message-id/dywwv6v6vq3wfqyebypspq7kuez44tnycbvqjspgsqypuunbzn@mzixkn6g47y2
[2] https://www.postgresql.org/message-id/CAGPVpCTJWEQLt2eOSDGTDtRbQPUQ9b9JtZWro9osJubTyWAEMA@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat



Re: Separate memory contexts for relcache and catcache

From
Ashutosh Bapat
Date:


On Tue, Nov 26, 2024 at 4:10 PM Rahila Syed <rahilasyed90@gmail.com> wrote:


Having reviewed the discussion regarding potential fragmentation issues caused by 
creating a large number of new contexts in each backend, I would like to take a step 
back and better understand the motivation behind separating these contexts.

IIUC, segregating cache memory allocations into RelCacheContext and CatCacheContext 
allows for grouping a large number of memory allocations under a 
common context, which, in turn, aids in monitoring memory consumption. However, 
I believe this reasoning does not apply to SPICacheContext and PlanCacheContext, 
as these contexts do not have any allocations of their own.

How, then, does separating these contexts from CacheMemoryContext improve monitoring?

A query which accumulates statistics based on the (context) path prefix (the path of PlanCacheContext or SPICacheContext) can be used to report the total memory allocated for plans. This will work even if we change the names of the child contexts, e.g. CachedPlanContext, CachedQueryContext, or if we add more child contexts. Such a change is probably unlikely, though. Is that advantage worth spending extra memory on fragmentation? Probably not. But I just wanted to note one use.

--
Best Wishes,
Ashutosh Bapat

Re: Separate memory contexts for relcache and catcache

From
Melih Mutlu
Date:
Hi Rahila,

Rahila Syed <rahilasyed90@gmail.com> wrote on Tue, 26 Nov 2024 at 13:40:
Observations:
1. While there are a number of child contexts like index info of RelCacheContext,
   CatCacheContext does not have any children.
2. While there is a bunch of used memory in RelCacheContext and CatCacheContext,
SPICacheContext and PlanCacheContext do not have any allocations of their own
and serve only as parents for SPI and CachedPlan related contexts respectively.

Thanks for sharing your observations and the diagram.
 
Having reviewed the discussion regarding potential fragmentation issues caused by 
creating a large number of new contexts in each backend, I would like to take a step 
back and better understand the motivation behind separating these contexts.

IIUC, segregating cache memory allocations into RelCacheContext and CatCacheContext 
allows for grouping a large number of memory allocations under a 
common context, which, in turn, aids in monitoring memory consumption. However, 
I believe this reasoning does not apply to SPICacheContext and PlanCacheContext, 
as these contexts do not have any allocations of their own.

How, then, does separating these contexts from CacheMemoryContext improve monitoring?
 Additionally, IIUC, these contexts are created as long-lived contexts, so they are not designed
 to optimize deletion of all their children via MemoryContextDelete on the parent.

I think it all depends on the level of granularity we want in grouping cache-related memory contexts. Currently, we have relatively low granularity, and increasing it comes with additional memory usage due to the newly introduced memory contexts. Ideally, having separate contexts for each cache type would allow us to see how much memory is allocated for each, as Ashutosh mentioned [1]. Even if a context does not have any allocations of its own, its children might still use some memory. I understand that we can already see total memory usage in, e.g., PlanCacheContext, since all of its children are named CachedPlan* and we can query based on this naming. However, this may not always hold true or could change in the future (though I’m not sure how likely that is).

That said, these changes come with a cost, and it may not be worth it to separate every single cache into its own context. IIUC, introducing contexts for heavily used caches results in much less fragmentation. If that’s the case, then I believe we should focus on RelCache and CatCache, as they are heavily used since the backend starts. I see that you and Ashutosh [2] mentioned that PlanCacheContext is less likely to be heavily used, so we could consider leaving that context out for now.


Attached a separate patch to change the initial sizes of the relcache and catcache contexts, as they grow large from the start. This was suggested previously in the thread [1]. Also changed CacheMemoryContext to use ALLOCSET_START_SMALL_SIZES, so it starts from 1KB.

Applying the same change to use ALLOCSET_START_SMALL_SIZES would be beneficial for 
SPICacheContext and PlanCacheContext contexts as well.

We can even use ALLOCSET_SMALL_SIZES if a context rarely has allocations of its own, or pick some other non-default sizes. I'm also okay with not introducing those new contexts at all, if that's what everyone agrees on.
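
For reference, the standard aset size bundles only differ in their initial and maximum block sizes (values as I recall them from memutils.h, worth double-checking):

    ctx = AllocSetContextCreate(parent, "SmallContext",
                                ALLOCSET_SMALL_SIZES);        /* 1kB initial, 8kB max block */
    ctx = AllocSetContextCreate(parent, "StartSmallContext",
                                ALLOCSET_START_SMALL_SIZES);  /* 1kB initial, 8MB max block */
    ctx = AllocSetContextCreate(parent, "DefaultContext",
                                ALLOCSET_DEFAULT_SIZES);      /* 8kB initial, 8MB max block */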


Thanks,
--
Melih Mutlu
Microsoft