Thread: Separate memory contexts for relcache and catcache
Hi hackers,
Most catcache and relcache entries (other than subsidiary data such as index info) currently go straight into CacheMemoryContext, and I believe these two caches usually account for the largest share of CacheMemoryContext's memory usage. For example, when a long-lived connection accesses many database objects, CacheMemoryContext tends to grow significantly.
While working on another patch for the pg_backend_memory_contexts view, we thought it would also be useful to see the memory usage of the different kinds of caches broken down into their own contexts. The attached patch implements this, making it easy to track the memory used by the relcache and catcache.
To quickly show what pg_backend_memory_contexts would look like, I did the following:
- Create some tables:
SELECT 'BEGIN;' UNION ALL SELECT format('CREATE TABLE %1$s(id serial primary key, data text not null unique)', 'test_'||g.i) FROM generate_series(0, 1000) g(i) UNION ALL SELECT 'COMMIT;';\gexec
- Open a new connection and query pg_backend_memory_contexts [1]:
This is what you'll see before and after the patch.
-- HEAD:
name | used_bytes | free_bytes | total_bytes
--------------------+------------+------------+-------------
CacheMemoryContext | 467656 | 56632 | 524288
index info | 111760 | 46960 | 158720
relation rules | 4416 | 3776 | 8192
(3 rows)
-- Patch:
name | used_bytes | free_bytes | total_bytes
-----------------------+------------+------------+-------------
CatCacheMemoryContext | 217696 | 44448 | 262144
RelCacheMemoryContext | 248264 | 13880 | 262144
index info | 111760 | 46960 | 158720
CacheMemoryContext | 2336 | 5856 | 8192
relation rules | 4416 | 3776 | 8192
(5 rows)
- Run a SELECT on all tables:
SELECT format('SELECT count(*) FROM %1$s', 'test_'||g.i) FROM generate_series(0, 1000) g(i);\gexec
- Then check pg_backend_memory_contexts [1] again:
-- HEAD:
name | used_bytes | free_bytes | total_bytes
--------------------+------------+------------+-------------
CacheMemoryContext | 8197344 | 257056 | 8454400
index info | 2102160 | 113776 | 2215936
relation rules | 4416 | 3776 | 8192
(3 rows)
-- Patch:
name | used_bytes | free_bytes | total_bytes
-----------------------+------------+------------+-------------
RelCacheMemoryContext | 4706464 | 3682144 | 8388608
CatCacheMemoryContext | 3489384 | 770712 | 4260096
index info | 2102160 | 113776 | 2215936
CacheMemoryContext | 2336 | 5856 | 8192
relation rules | 4416 | 3776 | 8192
(5 rows)
You can see that CacheMemoryContext does not use much memory once catcache and relcache are excluded (at least in cases similar to the above), and that it's easy to bloat the catcache and relcache. That's why I think it would be useful to see their usage separately.
Any feedback would be appreciated.
[1]
SELECT
name,sum(used_bytes) AS used_bytes,sum(free_bytes) AS free_bytes,sum(total_bytes) AS total_bytes
FROM pg_backend_memory_contexts
WHERE name LIKE '%CacheMemoryContext%' OR parent LIKE '%CacheMemoryContext%'
GROUP BY name
ORDER BY total_bytes DESC;
--
Melih Mutlu
Microsoft
+1 for the idea. This would be pretty useful as proof of which context is
consuming most of the memory, and it doesn't cost much. It would be
handier than estimating that with something like SELECT count(*) FROM pg_class.
For example, I think if we find the relcache using too much memory, it is
a signal that the user may be using too many partitioned tables.
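For instance, a minimal check along these lines (just a sketch, assuming the context names introduced by the patch) could flag a bloated cache:

SELECT name, sum(used_bytes) AS used_bytes
FROM pg_backend_memory_contexts
WHERE name IN ('CatCacheMemoryContext', 'RelCacheMemoryContext')
GROUP BY name
ORDER BY used_bytes DESC;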
Best Regards
Andy Fan
On 2023-Aug-09, Melih Mutlu wrote:

> -- Patch:
>  name                  | used_bytes | free_bytes | total_bytes
> -----------------------+------------+------------+-------------
>  RelCacheMemoryContext |    4706464 |    3682144 |     8388608
>  CatCacheMemoryContext |    3489384 |     770712 |     4260096
>  index info            |    2102160 |     113776 |     2215936
>  CacheMemoryContext    |       2336 |       5856 |        8192
>  relation rules        |       4416 |       3776 |        8192
> (5 rows)

Hmm, is this saying that there's too much fragmentation in the relcache
context? Maybe it would improve things to make it a SlabContext instead
of AllocSet. Or, more precisely, a bunch of SlabContexts, each with the
appropriate chunkSize for the object being stored.

(I don't say this because I know for a fact that Slab is better for these
purposes; it's just that I happened to read its comments yesterday and
they stated that it behaves better in terms of fragmentation. Maybe
Andres or Tomas have an opinion on this.)

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"I love the Postgres community. It's all about doing things _properly_. :-)" (David Garamond)
On Thu, 10 Aug 2023 at 01:23, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Aug-09, Melih Mutlu wrote:
> > -- Patch:
> >  name                  | used_bytes | free_bytes | total_bytes
> > -----------------------+------------+------------+-------------
> >  RelCacheMemoryContext |    4706464 |    3682144 |     8388608
> >  CatCacheMemoryContext |    3489384 |     770712 |     4260096
> >  index info            |    2102160 |     113776 |     2215936
> >  CacheMemoryContext    |       2336 |       5856 |        8192
> >  relation rules        |       4416 |       3776 |        8192
> > (5 rows)
>
> Hmm, is this saying that there's too much fragmentation in the relcache
> context?

free_bytes is just the space in the blocks that is not being used by any
allocated chunks or chunks on the freelist. It looks like
RelCacheMemoryContext has 10 blocks including the 8kB initial block:

postgres=# select 8192 + sum(8192*power(2,x)) as total_bytes from generate_series(0,9) x;
 total_bytes
-------------
     8388608

The first 2 blocks are 8kB as we only start doubling after we malloc the
first 8kB block after the keeper block. If there were 1 fewer block then
total_bytes would be 4194304, which is less than the used_bytes for that
context, so those 10 blocks look needed.

> Maybe it would improve things to make it a SlabContext instead
> of AllocSet. Or, more precisely, a bunch of SlabContexts, each with the
> appropriate chunkSize for the object being stored.

It would at least save from having to do the power of 2 rounding that
aset does. However, on a quick glance, it seems not all the size requests
in relcache.c are fixed. I see a datumCopy() in RelationBuildTupleDesc()
for the attmissingval stuff, so we couldn't SlabAlloc that.

It could be worth looking at the size classes of the fixed-sized
allocations to estimate how much memory we might save by using slab to
avoid the power-of-2 rounding that aset.c does. However, if there are too
many contexts then we may end up using more memory with all the
mostly-empty contexts for backends that only query a tiny number of
tables. That might not be good. Slab also does not do block doubling like
aset does, so it might be hard to choose a good block size.

> (I don't say this because I know for a fact that Slab is better for these
> purposes; it's just that I happened to read its comments yesterday and
> they stated that it behaves better in terms of fragmentation. Maybe
> Andres or Tomas have an opinion on this.)

I'm not sure of the exact comment, but I was in there recently and
there's a chance that I wrote it. Slab prioritizes putting new chunks on
fuller blocks and may free() blocks once they become empty of any chunks.
Aset does no free()ing of blocks unless a block was malloc()ed especially
for a chunk above allocChunkLimit. That means aset might hold a lot of
malloc'ed memory for chunks that just sit on freelists which might never
be used ever again, meanwhile other request sizes may have to malloc new
blocks.

David
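To illustrate the power-of-2 rounding David describes, here is a rough SQL sketch of the slack it can leave per chunk (illustrative arithmetic only, not the actual aset.c logic; aset also adds a chunk header and enforces a minimum chunk size):

SELECT request,
       power(2, ceil(log(2, request)))::int AS rounded_size,
       power(2, ceil(log(2, request)))::int - request AS slack
FROM (VALUES (24), (100), (500), (1000)) AS t(request);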
Hi,

On 2023-08-09 15:02:31 +0300, Melih Mutlu wrote:
> To quickly show what pg_backend_memory_contexts would look like, I did the
> following:
>
> - Create some tables:
> SELECT 'BEGIN;' UNION ALL SELECT format('CREATE TABLE %1$s(id serial
> primary key, data text not null unique)', 'test_'||g.i) FROM
> generate_series(0, 1000) g(i) UNION ALL SELECT 'COMMIT;';\gexec
>
> - Open a new connection and query pg_backend_memory_contexts [1]:
> This is what you'll see before and after the patch.
> -- HEAD:
>  name               | used_bytes | free_bytes | total_bytes
> --------------------+------------+------------+-------------
>  CacheMemoryContext |     467656 |      56632 |      524288
>  index info         |     111760 |      46960 |      158720
>  relation rules     |       4416 |       3776 |        8192
> (3 rows)
>
> -- Patch:
>  name                  | used_bytes | free_bytes | total_bytes
> -----------------------+------------+------------+-------------
>  CatCacheMemoryContext |     217696 |      44448 |      262144
>  RelCacheMemoryContext |     248264 |      13880 |      262144
>  index info            |     111760 |      46960 |      158720
>  CacheMemoryContext    |       2336 |       5856 |        8192
>  relation rules        |       4416 |       3776 |        8192
> (5 rows)

Have you checked what the source of the remaining allocations in
CacheMemoryContext are?

One thing that I had observed previously and reproduced with this patch,
is that the first backend starting after a restart uses considerably more
memory:

first:
┌───────────────────────┬────────────┬────────────┬─────────────┐
│         name          │ used_bytes │ free_bytes │ total_bytes │
├───────────────────────┼────────────┼────────────┼─────────────┤
│ CatCacheMemoryContext │     370112 │     154176 │      524288 │
│ RelCacheMemoryContext │     244136 │      18008 │      262144 │
│ index info            │     104392 │      45112 │      149504 │
│ CacheMemoryContext    │       2304 │       5888 │        8192 │
│ relation rules        │       3856 │        240 │        4096 │
└───────────────────────┴────────────┴────────────┴─────────────┘

second:
┌───────────────────────┬────────────┬────────────┬─────────────┐
│         name          │ used_bytes │ free_bytes │ total_bytes │
├───────────────────────┼────────────┼────────────┼─────────────┤
│ CatCacheMemoryContext │     215072 │      47072 │      262144 │
│ RelCacheMemoryContext │     243856 │      18288 │      262144 │
│ index info            │     104944 │      47632 │      152576 │
│ CacheMemoryContext    │       2304 │       5888 │        8192 │
│ relation rules        │       3856 │        240 │        4096 │
└───────────────────────┴────────────┴────────────┴─────────────┘

This isn't caused by this patch, but it does make it easier to pinpoint
than before. The reason is fairly simple: On the first start we start
without being able to use relcache init files, in later starts we can.
The reason the size increase is in CatCacheMemoryContext, rather than
RelCacheMemoryContext, is simple: When using the init file the catcache
isn't used; when not, we have to query the catcache a lot to build the
initial relcache contents.

Given the size of both CatCacheMemoryContext and RelCacheMemoryContext in
a new backend, I think it might be worth using non-default aset
parameters. A bit ridiculous to increase block sizes from 8k upwards in
every single connection made to postgres ever.
> - Run a SELECT on all tables:
> SELECT format('SELECT count(*) FROM %1$s', 'test_'||g.i) FROM
> generate_series(0, 1000) g(i);\gexec
>
> - Then check pg_backend_memory_contexts [1] again:
> -- HEAD:
>  name               | used_bytes | free_bytes | total_bytes
> --------------------+------------+------------+-------------
>  CacheMemoryContext |    8197344 |     257056 |     8454400
>  index info         |    2102160 |     113776 |     2215936
>  relation rules     |       4416 |       3776 |        8192
> (3 rows)
>
> -- Patch:
>  name                  | used_bytes | free_bytes | total_bytes
> -----------------------+------------+------------+-------------
>  RelCacheMemoryContext |    4706464 |    3682144 |     8388608
>  CatCacheMemoryContext |    3489384 |     770712 |     4260096
>  index info            |    2102160 |     113776 |     2215936
>  CacheMemoryContext    |       2336 |       5856 |        8192
>  relation rules        |       4416 |       3776 |        8192
> (5 rows)
>
> You can see that CacheMemoryContext does not use much memory without
> catcache and relcache (at least in cases similar to the above), and it's
> easy to bloat catcache and relcache. That's why I think it would be
> useful to see their usage separately.

Yes, I think it'd be quite useful. There's ways to bloat particularly
catcache much further, and it's hard to differentiate that from other
sources of bloat right now.

> +static void
> +CreateCatCacheMemoryContext()

We typically use (void) to differentiate from an older way of function
declarations that didn't have argument types.

> +{
> +	if (!CacheMemoryContext)
> +		CreateCacheMemoryContext();

I wish we just made sure that cache memory context were created in the
right place, instead of spreading this check everywhere...

> @@ -3995,9 +3998,9 @@ RelationCacheInitializePhase2(void)
> 		return;
>
> 	/*
> -	 * switch to cache memory context
> +	 * switch to relcache memory context
> 	 */
> -	oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
> +	oldcxt = MemoryContextSwitchTo(RelCacheMemoryContext);
>
> 	/*
> 	 * Try to load the shared relcache cache file. If unsuccessful, bootstrap
> @@ -4050,9 +4053,9 @@ RelationCacheInitializePhase3(void)
> 	RelationMapInitializePhase3();
>
> 	/*
> -	 * switch to cache memory context
> +	 * switch to relcache memory context
> 	 */
> -	oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
> +	oldcxt = MemoryContextSwitchTo(RelCacheMemoryContext);
>
> 	/*
> 	 * Try to load the local relcache cache file. If unsuccessful, bootstrap

I'd just delete these comments, they're just pointlessly restating the
code.

Greetings,

Andres Freund
Hi,

I also think this change would be helpful.

I imagine you're working on Andres's comments and have already noticed
this, but the v1 patch cannot be applied to HEAD. For the convenience of
other reviewers, I marked it 'Waiting on Author'.

--
Regards,
Atsushi Torikoshi
NTT DATA Group Corporation
Hi,
On Mon, 4 Dec 2023 at 07:59, torikoshia <torikoshia@oss.nttdata.com> wrote:
> Hi,
>
> I also think this change would be helpful.
>
> I imagine you're working on Andres's comments and have already noticed
> this, but the v1 patch cannot be applied to HEAD.
> For the convenience of other reviewers, I marked it 'Waiting on Author'.
Thanks for letting me know. I rebased the patch. PFA new version.
On Thu, 12 Oct 2023 at 20:01, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> Have you checked what the source of the remaining allocations in
> CacheMemoryContext are?
It's mostly the typcache, around 2kB. Do you think the typcache also needs a separate context?
> Given the size of both CatCacheMemoryContext and RelCacheMemoryContext in a
> new backend, I think it might be worth using non-default aset parameters. A
> bit ridiculous to increase block sizes from 8k upwards in every single
> connection made to postgres ever.
Considering it starts from ~262kB, what would be better for the init size? 256kB?
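As a minimal sketch of what that might look like (assuming the patch's context name; the 256kB initBlockSize is only the suggestion above, not a settled value):

CatCacheMemoryContext =
	AllocSetContextCreate(CacheMemoryContext,
						  "CatCacheMemoryContext",
						  0,			/* minContextSize */
						  256 * 1024,	/* initBlockSize: skip doubling up from 8kB */
						  ALLOCSET_DEFAULT_MAXSIZE);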
> +static void
> +CreateCatCacheMemoryContext()
> We typically use (void) to differentiate from an older way of function
> declarations that didn't have argument types.
Done.
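For reference, a sketch of the declaration corrected per that convention:

static void
CreateCatCacheMemoryContext(void)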
> +{
> + if (!CacheMemoryContext)
> + CreateCacheMemoryContext();
> I wish we just made sure that cache memory context were created in the right
> place, instead of spreading this check everywhere...
That would be nice. Do you have a suggestion about where that right place would be?
> I'd just delete these comments, they're just pointlessly restating the code.
Done.
Thanks,
Melih Mutlu
Microsoft
On Wed, 3 Jan 2024 at 16:56, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi,
>
> On Mon, 4 Dec 2023 at 07:59, torikoshia <torikoshia@oss.nttdata.com> wrote:
>>
>> Hi,
>>
>> I also think this change would be helpful.
>>
>> I imagine you're working on Andres's comments and have already noticed
>> this, but the v1 patch cannot be applied to HEAD.
>> For the convenience of other reviewers, I marked it 'Waiting on Author'.
>
> Thanks for letting me know. I rebased the patch. PFA new version.

CFBot shows that the patch does not apply anymore as in [1]:

=== Applying patches on top of PostgreSQL commit ID
729439607ad210dbb446e31754e8627d7e3f7dda ===
=== applying patch
./v2-0001-Separate-memory-contexts-for-relcache-and-catcach.patch
patching file src/backend/utils/cache/catcache.c
...
Hunk #8 FAILED at 1933.
Hunk #9 succeeded at 2253 (offset 84 lines).
1 out of 9 hunks FAILED -- saving rejects to file
src/backend/utils/cache/catcache.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4554.log

Regards,
Vignesh
On Sat, 27 Jan 2024 at 06:01, vignesh C <vignesh21@gmail.com> wrote:
> On Wed, 3 Jan 2024 at 16:56, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> CFBot shows that the patch does not apply anymore as in [1]:
>
> === Applying patches on top of PostgreSQL commit ID
> 729439607ad210dbb446e31754e8627d7e3f7dda ===
> === applying patch
> ./v2-0001-Separate-memory-contexts-for-relcache-and-catcach.patch
> patching file src/backend/utils/cache/catcache.c
> ...
> Hunk #8 FAILED at 1933.
> Hunk #9 succeeded at 2253 (offset 84 lines).
> 1 out of 9 hunks FAILED -- saving rejects to file
> src/backend/utils/cache/catcache.c.rej
>
> Please post an updated version for the same.
>
> [1] - http://cfbot.cputube.org/patch_46_4554.log
>
> Regards,
> Vignesh
Rebased. PSA.
Melih Mutlu
Microsoft
On Wed, 2024-04-03 at 16:12 +0300, Melih Mutlu wrote:
> Rebased. PSA.

Thank you. I missed your patch and came up with a similar patch over
here:

https://www.postgresql.org/message-id/flat/78599c442380ddb5990117e281a4fa65a74231af.camel@j-davis.com

I closed my thread and we can continue this one.

One difference is that I tried to capture almost all uses of
CacheMemoryContext so that it would become just a parent context without
many allocations of its own. The plan cache and SPI caches can be
important, too. Or, one of the other caches that we expect to be small
might grow in some edge cases (or due to a bug), and it would be good to
be able to see that.

I agree with others that we should look at changing the initial size or
type of the contexts, but that should be a separate commit.

Regards,
Jeff Davis
Hi,

On 2024-10-29 15:00:02 -0700, Jeff Davis wrote:
> On Wed, 2024-04-03 at 16:12 +0300, Melih Mutlu wrote:
> > Rebased. PSA.
>
> Thank you. I missed your patch and came up with a similar patch over
> here:
>
> https://www.postgresql.org/message-id/flat/78599c442380ddb5990117e281a4fa65a74231af.camel@j-davis.com
>
> I closed my thread and we can continue this one.
>
> One difference is that I tried to capture almost all uses of
> CacheMemoryContext so that it would become just a parent context
> without many allocations of its own.

I'm a bit worried about the increase in "wasted" memory we might end up
with when creating one aset for *everything*. Just splitting out Relcache
and CatCache isn't a big deal from that angle, they're always used
reasonably much. But creating a bunch of barely used contexts does have
the potential for lots of memory being wasted at the end of a page and on
freelists. It might be ok as far as what you proposed in the above email
goes, I haven't analyzed that in depth yet.

> I agree with others that we should look at changing the initial size or
> type of the contexts, but that should be a separate commit.

It needs to be done close together though, otherwise we'll increase the
new-connection-memory-usage of postgres measurably.

I've previously proposed creating a type of memory context that's
intended for places where we never expect to allocate much, which
allocates from either a superior memory context or just from the system
allocator and tracks memory via linked lists. That'd allow us to use
fairly granular memory contexts with low overhead, which we e.g. could
use to actually create each catcache & relcache entry in its own context.

One concern that was voiced about that idea was that it'd perform badly
if such a context did end up being used hotly - I'm not sure that's a
real problem, but we could address it by switching to a different
allocation scheme once a certain size is reached.

Greetings,

Andres Freund
On Fri, 2024-11-01 at 15:19 -0400, Andres Freund wrote:
> I'm a bit worried about the increase in "wasted" memory we might end up
> with when creating one aset for *everything*. Just splitting out
> Relcache and CatCache isn't a big deal from that angle, they're always
> used reasonably much. But creating a bunch of barely used contexts does
> have the potential for lots of memory being wasted at the end of a page
> and on freelists. It might be ok as far as what you proposed in the
> above email goes, I haven't analyzed that in depth yet.

Melih raised similar concerns. The new contexts that my patch created
were CatCacheContext, RelCacheContext, SPICacheContext, PgOutputContext,
PlanCacheContext, TextSearchCacheContext, and TypCacheContext.

Those are all created lazily, so you need to at least be using the
relevant feature before it has any cost (with the exception of the first
two).

> > I agree with others that we should look at changing the initial size
> > or type of the contexts, but that should be a separate commit.
>
> It needs to be done close together though, otherwise we'll increase the
> new-connection-memory-usage of postgres measurably.

I don't have a strong opinion here; that was a passing comment. But I'm
curious: why would it increase the per-connection memory usage much to
just have a couple new memory contexts?

> I've previously proposed creating a type of memory context that's
> intended for places where we never expect to allocate much, which
> allocates from either a superior memory context or just from the system
> allocator and tracks memory via linked lists.

Why not just use ALLOCSET_SMALL_SIZES?

Regards,
Jeff Davis
Hi,

On 2024-11-01 14:47:37 -0700, Jeff Davis wrote:
> Melih raised similar concerns. The new contexts that my patch created
> were CatCacheContext, RelCacheContext, SPICacheContext, PgOutputContext,
> PlanCacheContext, TextSearchCacheContext, and TypCacheContext.
>
> Those are all created lazily, so you need to at least be using the
> relevant feature before it has any cost (with the exception of the
> first two).

Well, you can't get very far without using at least CatCacheContext,
RelCacheContext, PlanCacheContext, TypCacheContext. The others are indeed
much more specific and not really worth worrying about.

> > It needs to be done close together though, otherwise we'll increase
> > the new-connection-memory-usage of postgres measurably.
>
> I don't have a strong opinion here; that was a passing comment. But I'm
> curious: why would it increase the per-connection memory usage much to
> just have a couple new memory contexts?

"much" is maybe too strong. But the memory usage in a new connection is
fairly low, it doesn't take a large increase to be noticeable
percentage-wise. And given how much people love having poolers full of
idle connections, it shows up in aggregate.

> > I've previously proposed creating a type of memory context that's
> > intended for places where we never expect to allocate much, which
> > allocates from either a superior memory context or just from the
> > system allocator and tracks memory via linked lists.
>
> Why not just use ALLOCSET_SMALL_SIZES?

That helps some, but not *that* much. You still end up with a bunch of
partially filled blocks. Here's e.g. an excerpt with your patch applied:

│ name                         │ ident                             │ type     │ level │ path         │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├──────────────────────────────┼───────────────────────────────────┼──────────┼───────┼──────────────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│ CacheMemoryContext           │ (null)                            │ AllocSet │     2 │ {1,19}       │        8192 │             1 │       7952 │           0 │        240 │
│ TypCacheContext              │ (null)                            │ AllocSet │     3 │ {1,19,28}    │        8192 │             1 │       4816 │           0 │       3376 │
│ search_path processing cache │ (null)                            │ AllocSet │     3 │ {1,19,29}    │        8192 │             1 │       5280 │           7 │       2912 │
│ CatCacheContext              │ (null)                            │ AllocSet │     3 │ {1,19,30}    │      262144 │             6 │      14808 │           0 │     247336 │
│ RelCacheContext              │ (null)                            │ AllocSet │     3 │ {1,19,31}    │      262144 │             6 │       8392 │           2 │     253752 │
│ relation rules               │ pg_backend_memory_contexts        │ AllocSet │     4 │ {1,19,31,34} │        8192 │             4 │       3280 │           1 │       4912 │
│ index info                   │ manyrows_pkey                     │ AllocSet │     4 │ {1,19,31,35} │        2048 │             2 │        864 │           1 │       1184 │
│ index info                   │ pg_statistic_ext_relid_index      │ AllocSet │     4 │ {1,19,31,36} │        2048 │             2 │        928 │           1 │       1120 │
│ index info                   │ pg_class_tblspc_relfilenode_index │ AllocSet │     4 │ {1,19,31,37} │        2048 │             2 │        440 │           1 │       1608 │

(this is a tiny bit misleading as "search_path processing cache" was just
moved)

You can quickly see that the various contexts have a decent amount of
free space, in some cases a sizable fraction of their space.

We've already been more aggressive about using separate contexts for
indexes - and in aggregate that memory usage shows up:

postgres[1088243][1]=# SELECT count(*), sum(total_bytes) as total_bytes,
sum(total_nblocks) as total_nblocks, sum(free_bytes) free_bytes,
sum(free_chunks) as free_chunks, sum(used_bytes) used_bytes
FROM pg_backend_memory_contexts
WHERE path @> (SELECT path FROM pg_backend_memory_contexts
               WHERE name = 'CacheMemoryContext')
  AND name = 'index info';
┌───────┬─────────────┬───────────────┬────────────┬─────────────┬────────────┐
│ count │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├───────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│    87 │      162816 │           144 │      48736 │         120 │     114080 │
└───────┴─────────────┴───────────────┴────────────┴─────────────┴────────────┘

And it's not just the partially filled blocks that are an "issue", it's
also the freelists that are much less likely to be used soon if they're
split very granularly. Often we'll end up with memory in freelists that
are created while building some information that then will not be used
again.

Without your patch:
┌────────────────────┬────────────────────────────┬──────────┬───────┬───────────┬─────────────┬───────────────┬────────────┬─────────────┬────────────┐
│ name               │ ident                      │ type     │ level │ path      │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├────────────────────┼────────────────────────────┼──────────┼───────┼───────────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│ CacheMemoryContext │ (null)                     │ AllocSet │     2 │ {1,17}    │      524288 │             7 │      75448 │           0 │     448840 │
│ relation rules     │ pg_backend_memory_contexts │ AllocSet │     3 │ {1,17,27} │        8192 │             4 │       3472 │           4 │       4720 │
...

Greetings,

Andres Freund
On Sat, Nov 2, 2024 at 3:17 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> Melih raised similar concerns. The new contexts that my patch created
> were CatCacheContext, RelCacheContext, SPICacheContext, PgOutputContext,
> PlanCacheContext, TextSearchCacheContext, and TypCacheContext.
>
> Those are all created lazily, so you need to at least be using the
> relevant feature before it has any cost (with the exception of the
> first two).
>
> I don't have a strong opinion here; that was a passing comment. But I'm
> curious: why would it increase the per-connection memory usage much to
> just have a couple new memory contexts?

Without patch

First backend:
SELECT count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes)) free_bytes,
sum(free_chunks) as free_chunks, pg_size_pretty(sum(used_bytes)) used_bytes
from pg_get_backend_memory_contexts();
 count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
-------+-------------+---------------+------------+-------------+------------
   121 | 1917 kB     |           208 | 716 kB     |         128 | 1201 kB
(1 row)

Second backend:
 count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
-------+-------------+---------------+------------+-------------+------------
   121 | 1408 kB     |           210 | 384 kB     |         186 | 1024 kB
(1 row)

With both patches from Melih applied

First backend:
 count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
-------+-------------+---------------+------------+-------------+------------
   124 | 1670 kB     |           207 | 467 kB     |         128 | 1203 kB
(1 row)

Second backend:
 count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
-------+-------------+---------------+------------+-------------+------------
   124 | 1417 kB     |           209 | 391 kB     |         187 | 1026 kB
(1 row)

So it looks like the patches do reduce memory allocated at the start of a
backend. That is better as far as the conditions just after backend start
are concerned. The chunks of memory allocated in a given context will
more likely have similar sizes, since they will be allocated for the same
types of objects, as compared to one big context where chunks are
allocated for many different kinds of objects. I believe this will lead
to better utilization of the freelists.

--
Best Wishes,
Ashutosh Bapat
On Sat, Nov 2, 2024 at 4:18 AM Andres Freund <andres@anarazel.de> wrote:
>
> > > I've previously proposed creating a type of memory context that's
> > > intended for places where we never expect to allocate much, which
> > > allocates from either a superior memory context or just from the
> > > system allocator and tracks memory via linked lists.
> >
> > Why not just use ALLOCSET_SMALL_SIZES?
>
> That helps some, but not *that* much. You still end up with a bunch of
> partially filled blocks. Here's e.g. an excerpt with your patch applied:
>
> │ name                         │ ident                             │ type     │ level │ path         │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
> ├──────────────────────────────┼───────────────────────────────────┼──────────┼───────┼──────────────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
> │ CacheMemoryContext           │ (null)                            │ AllocSet │     2 │ {1,19}       │        8192 │             1 │       7952 │           0 │        240 │
> │ TypCacheContext              │ (null)                            │ AllocSet │     3 │ {1,19,28}    │        8192 │             1 │       4816 │           0 │       3376 │
> │ search_path processing cache │ (null)                            │ AllocSet │     3 │ {1,19,29}    │        8192 │             1 │       5280 │           7 │       2912 │
> │ CatCacheContext              │ (null)                            │ AllocSet │     3 │ {1,19,30}    │      262144 │             6 │      14808 │           0 │     247336 │
> │ RelCacheContext              │ (null)                            │ AllocSet │     3 │ {1,19,31}    │      262144 │             6 │       8392 │           2 │     253752 │
> │ relation rules               │ pg_backend_memory_contexts        │ AllocSet │     4 │ {1,19,31,34} │        8192 │             4 │       3280 │           1 │       4912 │
> │ index info                   │ manyrows_pkey                     │ AllocSet │     4 │ {1,19,31,35} │        2048 │             2 │        864 │           1 │       1184 │
> │ index info                   │ pg_statistic_ext_relid_index      │ AllocSet │     4 │ {1,19,31,36} │        2048 │             2 │        928 │           1 │       1120 │
> │ index info                   │ pg_class_tblspc_relfilenode_index │ AllocSet │     4 │ {1,19,31,37} │        2048 │             2 │        440 │           1 │       1608 │
>
> You can quickly see that the various contexts have a decent amount of
> free space, in some cases a sizable fraction of their space.
>
> We've already been more aggressive about using separate contexts for
> indexes - and in aggregate that memory usage shows up:
>
> ┌───────┬─────────────┬───────────────┬────────────┬─────────────┬────────────┐
> │ count │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
> ├───────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
> │    87 │      162816 │           144 │      48736 │         120 │     114080 │
> └───────┴─────────────┴───────────────┴────────────┴─────────────┴────────────┘
>
> And it's not just the partially filled blocks that are an "issue", it's
> also the freelists that are much less likely to be used soon if they're
> split very granularly. Often we'll end up with memory in freelists that
> are created while building some information that then will not be used
> again.
>
> Without your patch:
> │ CacheMemoryContext │ (null)                     │ AllocSet │ 2 │ {1,17}    │ 524288 │ 7 │ 75448 │ 0 │ 448840 │
> │ relation rules     │ pg_backend_memory_contexts │ AllocSet │ 3 │ {1,17,27} │   8192 │ 4 │  3472 │ 4 │   4720 │
> ...

If these caches are not used at all, this might be a problem. But I think
the applications which use TextSearchCacheContext, let's say, are likely
to use it so frequently that the free chunks will be recycled. So I don't
know whether partial blocks and freelists will be a huge problem.

However, we agree that it's generally good to have (at least some)
specific contexts as children of the cache memory context. It will be
good to move ahead with the ones we all agree on for now. Looking at all
the emails, those will be CatCacheContext, RelCacheContext,
PlanCacheContext, TypCacheContext.

If we go with fewer contexts, it will be good not to lose the work Jeff
did for the other contexts though. I like those Create*CacheContext()
functions. They identify various specific uses of CacheMemoryContext. In
future, if we think that we need specific contexts for some of those,
these will be the functions where we will create them. We might need to
change the name of those functions to Get*CacheContext() instead of
Create*, since they won't create a context right now.

--
Best Wishes,
Ashutosh Bapat
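A minimal sketch of the lazily-created-context pattern being discussed (the names are illustrative, loosely following Jeff's patch, not committed code):

static MemoryContext CatCacheContext = NULL;

static MemoryContext
GetCatCacheContext(void)
{
	/* Create the context lazily on first use, parented to CacheMemoryContext */
	if (CatCacheContext == NULL)
		CatCacheContext = AllocSetContextCreate(CacheMemoryContext,
												"CatCacheContext",
												ALLOCSET_DEFAULT_SIZES);
	return CatCacheContext;
}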
On Mon, 2024-11-11 at 17:05 +0530, Ashutosh Bapat wrote:
> It will be good to move ahead with the ones we all agree on for now.
> Looking at all the emails, those will be CatCacheContext,
> RelCacheContext, PlanCacheContext, TypCacheContext.

I'm not sure we have consensus on all of those yet. Andres's concern,
IIUC, is that the additional memory contexts will cause additional
fragmentation.

I believe we have a rough consensus that CatCacheContext and
RelCacheContext are wanted, but we're trying to find ways to mitigate the
fragmentation.

Regards,
Jeff Davis
On Tue, Nov 12, 2024 at 2:57 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> I'm not sure we have consensus on all of those yet. Andres's concern,
> IIUC, is that the additional memory contexts will cause additional
> fragmentation.
>
> I believe we have a rough consensus that CatCacheContext and
> RelCacheContext are wanted, but we're trying to find ways to mitigate
> the fragmentation.

The totals (free_bytes, total_bytes, used_bytes) of the memory contexts
separated from CacheMemoryContext versus those without the separation are
(35968, 540672, 504704) vs (75448, 524288, 448840). There's an increase
of about 20K in used_bytes and total_bytes, and we guess/know that that
increase is because of fragmentation. Am I right? But I don't find any
reference to what load Andres ran which resulted in this state [1], so I
cannot judge whether that increase represents a typical case or not.

I experimented with the plan cache context. I created 1000 tables using
Melih's queries [2], but made them partitions of a single partitioned
table.

With no prepared statements:

#SELECT name, count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes)) free_bytes,
sum(free_chunks) as free_chunks, pg_size_pretty(sum(used_bytes)) used_bytes
from pg_get_backend_memory_contexts()
where name like 'CachedPlan%' or name = 'PlanCacheContext' group by name;
       name       | count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
------------------+-------+-------------+---------------+------------+-------------+------------
 PlanCacheContext |     1 | 8192 bytes  |             1 | 7952 bytes |           0 | 240 bytes
(1 row)

With 10 prepared statements, each selecting from the partitioned table:

#SELECT format('prepare all_tables_%s as SELECT count(*) FROM test', g.i)
from generate_series(1, 10) g(i); \gexec
       name       | count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
------------------+-------+-------------+---------------+------------+-------------+------------
 CachedPlanQuery  |    10 | 40 kB       |            30 | 17 kB      |           0 | 23 kB
 CachedPlanSource |    10 | 20 kB       |            20 | 3920 bytes |           0 | 16 kB
 PlanCacheContext |     1 | 8192 bytes  |             1 | 7952 bytes |           0 | 240 bytes
(3 rows)

After executing all those 10 statements:

#SELECT format('execute all_tables_%s', g.i)
from generate_series(1, 10) g(i); \gexec
       name       | count | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes
------------------+-------+-------------+---------------+------------+-------------+------------
 CachedPlan       |    10 | 20 MB       |           124 | 9388 kB    |          28 | 11 MB
 CachedPlanQuery  |    10 | 40 kB       |            30 | 17 kB      |           0 | 23 kB
 CachedPlanSource |    10 | 20 kB       |            20 | 3920 bytes |           0 | 16 kB
 PlanCacheContext |     1 | 8192 bytes  |             1 | 7952 bytes |           0 | 240 bytes
(4 rows)

PlanCacheContext is never used for actual planned statements. In fact I
am not sure whether those 8K bytes it's consuming are real or just
context overhead. The real memory is used from the CachedPlan* contexts,
which are created and destroyed for each prepared statement. The only use
of the shell context is to be able to query memory context statistics of
cached plans, in case we change the names of the contexts for individual
planned queries in future:

SELECT name, count(*), pg_size_pretty(sum(total_bytes)) as total_bytes,
sum(total_nblocks) as total_nblocks, pg_size_pretty(sum(free_bytes)) free_bytes,
sum(free_chunks) as free_chunks, pg_size_pretty(sum(used_bytes)) used_bytes
from pg_get_backend_memory_contexts()
where path @> (select path from pg_get_backend_memory_contexts()
               where name = 'PlanCacheContext') group by name;

So separating PlanCacheContext seems to have little use.

[1] https://www.postgresql.org/message-id/dywwv6v6vq3wfqyebypspq7kuez44tnycbvqjspgsqypuunbzn@mzixkn6g47y2
[2] https://www.postgresql.org/message-id/CAGPVpCTJWEQLt2eOSDGTDtRbQPUQ9b9JtZWro9osJubTyWAEMA@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat