RE: memory leak in logical WAL sender with pgoutput's cachectx - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: memory leak in logical WAL sender with pgoutput's cachectx
Date
Msg-id OS0PR01MB5716FB74667B3AA9B4A68FD19434A@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to RE: memory leak in logical WAL sender with pgoutput's cachectx  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses Re: memory leak in logical WAL sender with pgoutput's cachectx
List pgsql-hackers
On Thursday, August 14, 2025 8:49 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote:
> 
> Dear Xuneng,
> 
> > Is it safe to free the substructure from within rel_sync_cache_relation_cb()?
> 
> You referred the comment in rel_sync_cache_relation_cb() right? I understood
> like that we must not access to any *system caches*, from the comment. Here
> we do not re-build caches so that we do not access to the syscaches - it is
> permitted.

We should also avoid freeing the cache within the callback function, as
entries might still be accessed after the callback returns. The callback can
be triggered during the execution of pgoutput_change(), so freeing an entry
there risks accessing already-freed memory. For reference, see commit 7f481b8,
which addressed a similar issue. Ideally, the cache release should happen on
the normal code path, such as within pgoutput_change() or get_rel_sync_entry().
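
To illustrate what I have in mind (a rough, untested sketch, not taken from
any posted patch; free_rel_sync_entry_contents() is a made-up placeholder,
and the handling of relid == InvalidOid is elided), the callback would only
mark the entry, and the actual release would happen on the normal path:

static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
    RelationSyncEntry *entry;

    if (RelationSyncCache == NULL)
        return;

    /*
     * HASH_FIND only.  We must not free anything here, because the
     * callback can fire while pgoutput_change() still holds a pointer
     * to this entry.
     */
    entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
                                              &relid, HASH_FIND, NULL);
    if (entry != NULL)
        entry->replicate_valid = false;
}

static RelationSyncEntry *
get_rel_sync_entry(PGOutputData *data, Relation relation)
{
    Oid         relid = RelationGetRelid(relation);
    bool        found;
    RelationSyncEntry *entry;

    entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
                                              &relid, HASH_ENTER, &found);

    if (found && !entry->replicate_valid)
    {
        /*
         * Safe point to release the per-entry allocations: we are on
         * the normal code path, not inside an invalidation callback.
         */
        free_rel_sync_entry_contents(entry);
    }

    /* ... rebuild the entry as we do today ... */

    return entry;
}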

> 
> > I'm also interested in the reasoning behind setting
> > NINVALIDATION_THRESHOLD to 100.
> 
> This is the debatable point of this implementation. I set to 100 because it was
> sufficient with the provided workload, but this may cause the replication lag.
> We may have to consider a benchmark workload, measure data, and consider
> the appropriate value. 100 is just an initial point.

I think it's worthwhile to test the performance impact of this approach. If a
table is altered but not dropped, removing its entry adds the overhead of
rebuilding it on the next change. That overhead may well be negligible, but it
should be measured rather than assumed. Additionally, if the number of
invalidations is large, cleaning up entries one by one could take a
considerable amount of time and degrade performance. In that case, it might be
better to destroy the hash table and create a new one. So, analyzing these
scenarios is essential.
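
For instance (again only an untested sketch with made-up names, assuming the
per-entry substructure lives in pgoutput's cachectx; the counter would be
bumped in the invalidation callback), the cleanup path could fall back to
recreating the whole cache once the number of pending invalidations gets
large:

#define REL_SYNC_CACHE_RESET_THRESHOLD 100  /* needs benchmarking */

static int  rel_sync_cache_pending_invalidations = 0;

static void
maybe_reset_rel_sync_cache(PGOutputData *data)
{
    if (rel_sync_cache_pending_invalidations < REL_SYNC_CACHE_RESET_THRESHOLD)
        return;

    /* Throw away the hash table instead of freeing entries one by one. */
    hash_destroy(RelationSyncCache);
    RelationSyncCache = NULL;

    /*
     * The per-entry allocations in pgoutput's cachectx (the leak this
     * thread is about) have to go away as well; resetting the context
     * is one option, assuming nothing else lives in it.
     */
    MemoryContextReset(data->cachectx);

    /* Recreate the hash table the same way it was built initially. */
    init_rel_sync_cache(CacheMemoryContext);

    rel_sync_cache_pending_invalidations = 0;
}

Whether this ends up cheaper than per-entry cleanup is exactly the kind of
thing the benchmark should tell us.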

Best Regards,
Hou zj

