Re: pg_walinspect memory leaks - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: pg_walinspect memory leaks
Date
Msg-id CALj2ACWfRjm_Z7QnkcbZjz2EJeaD=C0Oze=qXG_=mTxzB=jJBA@mail.gmail.com
Whole thread Raw
In response to Re: pg_walinspect memory leaks  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: pg_walinspect memory leaks
Re: pg_walinspect memory leaks
List pgsql-hackers
On Tue, Feb 14, 2023 at 4:07 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Feb 14, 2023 at 6:25 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2023-02-13 15:22:02 -0800, Peter Geoghegan wrote:
> > > More concretely, it looks like GetWALRecordInfo() calls
> > > CStringGetTextDatum/cstring_to_text in a way that accumulates way too
> > > much memory in ExprContext.
> >
> > Additionally, we leak two stringinfos for each record.
> >
> >
> > > This could be avoided by using a separate memory context that is reset
> > > periodically, or something else along the same lines.
> >
> > Everything other than a per-row memory context that's reset each time seems
> > hard to manage in this case.
> >
> > Somehwat funnily, GetWALRecordsInfo() then ends up being unnecessarily
> > dilligent about cleaning up O(1) memory, after not caring about O(N) memory...
>
> Thanks for reporting. I'll get back to you on this soon.

The memory usage goes up with many WAL records in GetWALRecordsInfo().
The affected functions are pg_get_wal_records_info() and
pg_get_wal_records_info_till_end_of_wal(). I think the best way to fix
this is to use a temporary memory context (like the jsonfuncs.c),
reset it after every tuple is put into the tuple store. This fix keeps
the memory under limits. I'm attaching the patches here. For HEAD, I'd
want to be a bit defensive and use the temporary memory context for
pg_get_wal_fpi_info() too.

And, the fix also needs to be back-patched to PG15.

[1]
HEAD:
   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+
COMMAND
1105979 ubuntu    20   0   28.5g  28.4g 150492 R  80.7  93.0   1:47.12
postgres

PATCHED:
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+
COMMAND
  13149 ubuntu    20   0  173244 156872 150688 R 79.0   0.5   1:25.09
postgres

postgres=# select count(*) from
pg_get_wal_records_info_till_end_of_wal('0/1000000');
  count
----------
 35285649
(1 row)

postgres=# select pg_backend_pid();
 pg_backend_pid
----------------
          13149
(1 row)

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: "Drouvot, Bertrand"
Date:
Subject: Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16
Next
From: Amit Kapila
Date:
Subject: Re: Allow logical replication to copy tables in binary format