Thread: Re: Memory leak in WAL sender with pgoutput (v10~)
On Mon, Dec 2, 2024 at 7:19 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Sat, Nov 30, 2024 at 09:28:31AM +0100, Alvaro Herrera wrote: > > I'm not sure about your proposed fix. Isn't it simpler to have another > > memory context which we can reset instead of doing list_free_deep()? It > > doesn't have to be a global memory context -- since this is not > > reentrant and not referenced anywhere else, it can be a simple static > > variable in that block, as in the attached. I ran the stock tests (no > > sysbench) and at least it doesn't crash. > > > > This should be easily backpatchable also, since there's no ABI change. > > Yeah. Using more memory contexts around these API calls done without > the cache manipulation is something I've also mentioned to Amit a few > days ago, and I'm feeling that this concept should be applied to a > broader area than just the cached publication list in light of this > thread and ea792bfd93ab. > We already have PGOutputData->cachectx which could be used for it. I think we should be able to reset such a context when we are revalidating the publications. Even, if we want a new context for some localized handling, we should add that in PGOutputData rather than a local context as the proposed patch is doing at the very least for HEAD. Can't we consider freeing the publication names individually that can be backpatchable and have no or minimal risk of breaking anything? > > I am slightly concerned about the current design of GetPublication() > in the long-term, TBH. LoadPublications() has hidden the leak behind > two layers of routines in the WAL sender, and that's easy to miss once > you call anything that loads a Publication depending on how the caller > caches its data. So I would still choose for modifying the structure > on HEAD removing the pstrdup() for the publication name. > BTW, the subscription structure also used the name in a similar way. This will make the publication/subscription names handled differently. -- With Regards, Amit Kapila.
On Mon, Dec 2, 2024 at 12:49 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, Dec 02, 2024 at 11:47:09AM +0530, Amit Kapila wrote: > > We already have PGOutputData->cachectx which could be used for it. I > > think we should be able to reset such a context when we are > > revalidating the publications. Even, if we want a new context for some > > localized handling, we should add that in PGOutputData rather than a > > local context as the proposed patch is doing at the very least for > > HEAD. > > cachectx is used for the publications and the hash table holding > all the RelationSyncEntry entries, but we lack control of individual > parts within it. So you cannot reset the whole context when > processing a publication invalication. Perhaps adding that to > PGOutputData would be better, but that would be inconsistent with > RelationSyncCache. > AFAICS, RelationSyncCache is not allocated in PGOutputData->cachectx. It is allocated in CacheMemoryContext, see caller of init_rel_sync_cache(). I think you are talking about individual hash entries. Ideally, we can free all entries together and reset cachectx but right now, we are freeing the allocated memory in those entries, if required, at the next access. So, resetting the entire PGOutputData->cachectx won't be possible. But, I don't get why adding new context in PGOutputData for publications would be inconsistent with RelationSyncCache? Anyway, I think it would be okay to retail-free in this case, see the below responses. > > Can't we consider freeing the publication names individually that can > > be backpatchable and have no or minimal risk of breaking anything? > > Sure. The first thing I did was a loop that goes through the > publication list and does individual pfree() for the publication > names. That works, but IMO that's weird as we rely on the internals > of GetPublication() hidden two levels down in pg_publication.c. > We can look at it from a different angle which is that the FreePublication(s) relies on how the knowledge of Publication structure is built. So, it doesn't look weird if we think from that angle. > >> I am slightly concerned about the current design of GetPublication() > >> in the long-term, TBH. LoadPublications() has hidden the leak behind > >> two layers of routines in the WAL sender, and that's easy to miss once > >> you call anything that loads a Publication depending on how the caller > >> caches its data. So I would still choose for modifying the structure > >> on HEAD removing the pstrdup() for the publication name. > >> > > > > BTW, the subscription structure also used the name in a similar way. > > This will make the publication/subscription names handled differently. > > Good point about the inconsistency, so the name could also be switched > to a fixed-NAMEDATALEN there if we were to do that. The subscription > has much more pstrdup() fields, though.. How about having some Free() > routines instead that deal with the whole cleanup of a single list > entry? If that's kept close to the GetPublication() and > GetSubscription() routines, a refresh when changing these structures > would be hard to miss. > We already have FreeSubscription() which free name and other things before calling list_free_deep. So, I thought a call on those lines for publications wouldn't be a bad idea. -- With Regards, Amit Kapila.
On 2024-Dec-02, Amit Kapila wrote: > We already have PGOutputData->cachectx which could be used for it. > I think we should be able to reset such a context when we are > revalidating the publications. I don't see that context being reset anywhere, so I have a hard time imagining that this would work without subtle risk of breakage elsewhere. > Even, if we want a new context for some localized handling, we should > add that in PGOutputData rather than a local context as the proposed > patch is doing at the very least for HEAD. I don't necessarily agree, given that this context is not needed anywhere else. > Can't we consider freeing the publication names individually that can > be backpatchable and have no or minimal risk of breaking anything? That was my first thought, but then it occurred to me that such a thing would be totally pointless. > > you call anything that loads a Publication depending on how the caller > > caches its data. So I would still choose for modifying the structure > > on HEAD removing the pstrdup() for the publication name. > > BTW, the subscription structure also used the name in a similar way. > This will make the publication/subscription names handled differently. True (with conninfo, slotname, synccommit, and origin). FWIW it seems FreeSubscription is incomplete, and not only because it fails to free the publication names ... (Why are we storing a string in Subscription->synccommit?) -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Los trabajadores menos efectivos son sistematicamente llevados al lugar donde pueden hacer el menor daño posible: gerencia." (El principio Dilbert)
On Tue, Dec 3, 2024 at 2:12 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-Dec-02, Amit Kapila wrote: > > > Even, if we want a new context for some localized handling, we should > > add that in PGOutputData rather than a local context as the proposed > > patch is doing at the very least for HEAD. > > I don't necessarily agree, given that this context is not needed > anywhere else. > But that suits the current design more. We allocate PGOutputData and other contexts in that structure in a "Logical decoding context". A few of its members (publications, publication_names) residing in totally unrelated contexts sounds odd. In the first place, we don't need to allocate publications under CacheMemoryContext, they should be allocated in PGOutputData->cachectx. However, because we need to free those entirely at one-shot during invalidation processing, we could use a new context as a child context of PGOutputData->cachectx. Unless I am missing something, the current memory context usage appears more like a coding convenience than a thoughtful design decision. -- With Regards, Amit Kapila.
On Tue, Dec 3, 2024 at 5:47 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, Dec 02, 2024 at 03:29:56PM +0530, Amit Kapila wrote: > > We can look at it from a different angle which is that the > > FreePublication(s) relies on how the knowledge of Publication > > structure is built. So, it doesn't look weird if we think from that > > angle. > > OK, I can live with that on all the stable branches with an extra > list free rather than a deep list free. > > I agree that the memory handling of this whole area needs some rework > to make such leaks harder to introduce in the WAL sender. Still, > let's first solve the problem at hand :) > > So how about the attached that introduces a FreePublication() matching > with GetPublication(), used to do the cleanup? Feel free to comment. > -- Perhaps the patch can use foreach_ptr macro instead of foreach? ====== Kind Regards, Peter Smith. Fujitsu Australia
On 2024-Dec-03, Peter Smith wrote: > Perhaps the patch can use foreach_ptr macro instead of foreach? That cannot be backpatched, though. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end." (2nd Commandment for C programmers)
On Tue, Dec 3, 2024 at 11:57 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Dec 03, 2024 at 09:50:42AM +0530, Amit Kapila wrote: > > But that suits the current design more. We allocate PGOutputData and > > other contexts in that structure in a "Logical decoding context". A > > few of its members (publications, publication_names) residing in > > totally unrelated contexts sounds odd. In the first place, we don't > > need to allocate publications under CacheMemoryContext, they should be > > allocated in PGOutputData->cachectx. However, because we need to free > > those entirely at one-shot during invalidation processing, we could > > use a new context as a child context of PGOutputData->cachectx. Unless > > I am missing something, the current memory context usage appears more > > like a coding convenience than a thoughtful design decision. > > PGOutputData->cachectx has been introduced in 2022 in commit 52e4f0cd4, > while the decision to have RelationSyncEntry and the publication list > in CacheMemoryContext gets down to v10 where this logical replication > has been introduced. This was a carefully-thought choice back then > because this is data that belongs to the process cache, so yes, this > choice makes sense to me. > The parent structure (PGOutputData) was stored in the "Logical decoding context" even in v11. So, how does storing its member 'publications' in CacheMemoryContext a good idea? It is possible that we are leaking memory while doing decoding via SQL APIs where we free decoding context after getting changes though I haven't tested the same. -- With Regards, Amit Kapila.
On Tue, Dec 3, 2024 at 12:17 PM Michael Paquier <michael@paquier.xyz> wrote: > > So how about the attached that introduces a FreePublication() matching > with GetPublication(), used to do the cleanup? Feel free to comment. > As you would have noted I am fine with the fix on these lines but I suggest holding it till we conclude the memory context point raised by me today. It is possible that we are still leaking some memory in other related scenarios. -- With Regards, Amit Kapila.
On Tuesday, December 3, 2024 4:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Dec 3, 2024 at 11:57 AM Michael Paquier <michael@paquier.xyz> > wrote: > > > > On Tue, Dec 03, 2024 at 09:50:42AM +0530, Amit Kapila wrote: > > > But that suits the current design more. We allocate PGOutputData and > > > other contexts in that structure in a "Logical decoding context". A > > > few of its members (publications, publication_names) residing in > > > totally unrelated contexts sounds odd. In the first place, we don't > > > need to allocate publications under CacheMemoryContext, they should > > > be allocated in PGOutputData->cachectx. However, because we need to > > > free those entirely at one-shot during invalidation processing, we > > > could use a new context as a child context of > > > PGOutputData->cachectx. Unless I am missing something, the current > > > memory context usage appears more like a coding convenience than a > thoughtful design decision. > > > > PGOutputData->cachectx has been introduced in 2022 in commit > > PGOutputData->52e4f0cd4, > > while the decision to have RelationSyncEntry and the publication list > > in CacheMemoryContext gets down to v10 where this logical replication > > has been introduced. This was a carefully-thought choice back then > > because this is data that belongs to the process cache, so yes, this > > choice makes sense to me. > > > > The parent structure (PGOutputData) was stored in the "Logical decoding > context" even in v11. So, how does storing its member 'publications' in > CacheMemoryContext a good idea? It is possible that we are leaking memory > while doing decoding via SQL APIs where we free decoding context after > getting changes though I haven't tested the same. Right. I think I have faced this memory leak recently. It might be true for walsender that 'publications' is a per-process content. But SQL APIs might use different publication names each time during execution. I can reproduce the memory leak due to allocating the publication names under CacheMemoryContext like the following: -- CREATE PUBLICATION pub FOR ALL TABLES; CREATE TABLE stream_test(a int); SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'pgoutput'); INSERT INTO stream_test SELECT generate_series(1, 2, 1); - he backend's memory usage increases with each execution of the following function SELECT count(*) FROM pg_logical_slot_peek_binary_changes('isolation_slot', NULL, NULL, 'proto_version', '4', 'publication_names','pub,pub,........ <lots of pub names>'); Best Regards, Hou zj
On Tuesday, December 3, 2024 4:43 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-Dec-02, Amit Kapila wrote: > > > you call anything that loads a Publication depending on how the > > > caller caches its data. So I would still choose for modifying the > > > structure on HEAD removing the pstrdup() for the publication name. > > > > BTW, the subscription structure also used the name in a similar way. > > This will make the publication/subscription names handled differently. > > True (with conninfo, slotname, synccommit, and origin). ... > > (Why are we storing a string in Subscription->synccommit?) I think it's because the primary purpose of sub->synccommit is to serve as a parameter for SetConfigOption() in the apply worker, which requires a string value. Additionally, the existing function set_config_option() that validates this option only accepts a string input. Although we could convert sub->synccommit to an integer, this would necessitate additional conversion code before passing it to these functions. Best Regards, Hou zj