Re: Memory leak in WAL sender with pgoutput (v10~) - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Memory leak in WAL sender with pgoutput (v10~) |
Date | |
Msg-id | CAA4eK1LPDBqTjJMH5kowHq+NExR1vLecAweVEzNSONrsqCc2KQ@mail.gmail.com Whole thread Raw |
In response to | Re: Memory leak in WAL sender with pgoutput (v10~) (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Mon, Dec 2, 2024 at 12:49 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, Dec 02, 2024 at 11:47:09AM +0530, Amit Kapila wrote: > > We already have PGOutputData->cachectx which could be used for it. I > > think we should be able to reset such a context when we are > > revalidating the publications. Even, if we want a new context for some > > localized handling, we should add that in PGOutputData rather than a > > local context as the proposed patch is doing at the very least for > > HEAD. > > cachectx is used for the publications and the hash table holding > all the RelationSyncEntry entries, but we lack control of individual > parts within it. So you cannot reset the whole context when > processing a publication invalication. Perhaps adding that to > PGOutputData would be better, but that would be inconsistent with > RelationSyncCache. > AFAICS, RelationSyncCache is not allocated in PGOutputData->cachectx. It is allocated in CacheMemoryContext, see caller of init_rel_sync_cache(). I think you are talking about individual hash entries. Ideally, we can free all entries together and reset cachectx but right now, we are freeing the allocated memory in those entries, if required, at the next access. So, resetting the entire PGOutputData->cachectx won't be possible. But, I don't get why adding new context in PGOutputData for publications would be inconsistent with RelationSyncCache? Anyway, I think it would be okay to retail-free in this case, see the below responses. > > Can't we consider freeing the publication names individually that can > > be backpatchable and have no or minimal risk of breaking anything? > > Sure. The first thing I did was a loop that goes through the > publication list and does individual pfree() for the publication > names. That works, but IMO that's weird as we rely on the internals > of GetPublication() hidden two levels down in pg_publication.c. > We can look at it from a different angle which is that the FreePublication(s) relies on how the knowledge of Publication structure is built. So, it doesn't look weird if we think from that angle. > >> I am slightly concerned about the current design of GetPublication() > >> in the long-term, TBH. LoadPublications() has hidden the leak behind > >> two layers of routines in the WAL sender, and that's easy to miss once > >> you call anything that loads a Publication depending on how the caller > >> caches its data. So I would still choose for modifying the structure > >> on HEAD removing the pstrdup() for the publication name. > >> > > > > BTW, the subscription structure also used the name in a similar way. > > This will make the publication/subscription names handled differently. > > Good point about the inconsistency, so the name could also be switched > to a fixed-NAMEDATALEN there if we were to do that. The subscription > has much more pstrdup() fields, though.. How about having some Free() > routines instead that deal with the whole cleanup of a single list > entry? If that's kept close to the GetPublication() and > GetSubscription() routines, a refresh when changing these structures > would be hard to miss. > We already have FreeSubscription() which free name and other things before calling list_free_deep. So, I thought a call on those lines for publications wouldn't be a bad idea. -- With Regards, Amit Kapila.
pgsql-hackers by date: