Re: Add GUC to tune glibc's malloc implementation. - Mailing list pgsql-hackers

From Ronan Dunklau
Subject Re: Add GUC to tune glibc's malloc implementation.
Msg-id 1900350.taCxCBeP46@aivenlaptop
In response to Re: Add GUC to tune glibc's malloc implementation.  (Andres Freund <andres@anarazel.de>)
Responses Re: Add GUC to tune glibc's malloc implementation.
List pgsql-hackers
On Tuesday, 27 June 2023, 20:17:46 CEST, Andres Freund wrote:
> > Yes this is probably much more appropriate, but a much larger change with
> > greater risks of regression. Especially as we have to make sure we're not
> > overfitting our own code for a specific malloc implementation, to the
> > detriment of others.
>
> I think your approach is fundamentally overfitting our code to a specific
> malloc implementation, in a way that's not tunable by mere mortals. It just
> seems like a dead end to me.

I see it as a way to have *some* sort of control over the malloc
implementation we use, instead of tuning our allocation patterns on top of it
while treating it entirely as a black box. As for the tuning, I proposed
earlier to replace this parameter, which is expressed in terms of size, with a
"profile" (greedy / conservative) to make it easier to pick a sensible value.
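
To make the "profile" idea concrete, here is a minimal sketch of how such a
setting could be translated into glibc mallopt() calls. It is not taken from
the patch: the profile names, the MallocTuning struct, the helper functions and
the "two times work_mem" heuristic are all assumptions made for illustration.
Only mallopt(), M_TRIM_THRESHOLD and M_TOP_PAD are real glibc interfaces.

```c
#include <assert.h>
#include <limits.h>
#include <malloc.h>             /* glibc-specific: mallopt(), M_* constants */
#include <stdbool.h>

/* Hypothetical profile names, for illustration only. */
typedef enum
{
    MALLOC_PROFILE_CONSERVATIVE,
    MALLOC_PROFILE_GREEDY
} MallocProfile;

typedef struct
{
    int trim_threshold;         /* free bytes at top of heap before trimming */
    int top_pad;                /* extra bytes requested per heap extension */
} MallocTuning;

/* mallopt() takes an int, so clamp large byte counts. */
static int
clamp_to_int(long long bytes)
{
    return bytes > INT_MAX ? INT_MAX : (int) bytes;
}

/* Pure function: derive the settings for a profile from work_mem (in kB). */
static MallocTuning
tuning_for_profile(MallocProfile profile, int work_mem_kb)
{
    MallocTuning t;
    long long   work_mem_bytes = (long long) work_mem_kb * 1024;

    assert(work_mem_kb >= 0);

    if (profile == MALLOC_PROFILE_GREEDY)
    {
        /* Assumed heuristic: keep about two work_mem's worth of heap. */
        t.trim_threshold = clamp_to_int(2 * work_mem_bytes);
        t.top_pad = clamp_to_int(work_mem_bytes);
    }
    else
    {
        /* glibc's documented defaults (128 kB each). */
        t.trim_threshold = 128 * 1024;
        t.top_pad = 128 * 1024;
    }
    return t;
}

/* Apply the settings; mallopt() returns 1 on success and 0 on error. */
static bool
apply_malloc_tuning(MallocTuning t)
{
    return mallopt(M_TRIM_THRESHOLD, t.trim_threshold) == 1 &&
           mallopt(M_TOP_PAD, t.top_pad) == 1;
}
```

Keeping the size computation separate from the mallopt() calls means the
heuristic can be changed or tested without touching the allocator state.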

>
> > Except if you hinted we should write our own directly instead?
>
> I don't think we should write our own malloc - we don't rely on it much
> ourselves. And if we replace it, we need to care about malloc's performance
> characteristics a whole lot, because various libraries etc do heavily rely
> on it.
>
> However, I do think we should eventually avoid using malloc() for aset.c et
> al. malloc() is a general allocator, but at least for allocations below
> maxBlockSize, aset.c doesn't do allocations in a way that really benefits
> from that *at all*. It's not a lot of work to do such allocations on our
> own.
>
> > > We e.g. could keep a larger number of memory blocks reserved
> > > ourselves. Possibly by delaying the release of additionally held blocks
> > > until we have been idle for a few seconds or such.
> >
> > I think keeping work_mem around after it has been used a couple times makes
> > sense. This is the memory a user is willing to dedicate to operations,
> > after all.
>
> The biggest overhead of returning pages to the kernel is that that triggers
> zeroing the data during the next allocation. Particularly on multi-node
> servers that's surprisingly slow.  It's most commonly not the brk() or
> mmap() themselves that are the performance issue.
>
> Indeed, with your benchmark, I see that most of the time, on my dual Xeon
> Gold 5215 workstation, is spent zeroing newly allocated pages during page
> faults. That microarchitecture is worse at this than some others, but it's
> never free (or cache friendly).

I'm not sure I see the practical difference between those, but that's
interesting. Were you able to reproduce my results?

> FWIW, in my experience trimming the brk()ed region doesn't work reliably
> enough in real world postgres workloads to be worth relying on (from a
> memory usage POV). Sooner or later you're going to have longer lived
> allocations placed that will prevent it from happening.

I'm not sure I follow: since our workload is clearly split at query and
transaction boundaries, and we release memory at those points, I've assumed
(and observed in practice, albeit not on a production system) that most memory
at the top of the heap would be trimmable, as we don't keep much around between
queries / transactions.

>
> I have played around with telling aset.c that certain contexts are long
> lived and using mmap() for those, to make it more likely that the libc
> malloc/free can actually return memory to the system. I think that can be
> quite worthwhile.

So, if I understand your different suggestions correctly, we should:
 - use mmap ourselves for what we deem to be "one-off" allocations, to make
sure that memory does not hang around once we no longer use it
 - keep some pool allocated which will not be freed in between queries, but
reused the next time we need it.
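
To check my understanding of those two points, here is a rough sketch. Nothing
in it comes from aset.c: BlockPool, pool_get_block, pool_put_block,
oneoff_alloc, oneoff_free and both constants are hypothetical names and values
chosen for the example.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

#define POOL_BLOCK_SIZE (8 * 1024)  /* assumed fixed block size */
#define POOL_MAX_CACHED 4           /* assumed number of blocks kept around */

/* A tiny cache of freed blocks, kept across queries instead of free()d. */
typedef struct BlockPool
{
    void   *cached[POOL_MAX_CACHED];
    int     ncached;
} BlockPool;

/* Reuse a cached block if there is one, else fall back to malloc(). */
static void *
pool_get_block(BlockPool *pool)
{
    if (pool->ncached > 0)
        return pool->cached[--pool->ncached];
    return malloc(POOL_BLOCK_SIZE);
}

/* Keep the block for later reuse; only free() it once the cache is full. */
static void
pool_put_block(BlockPool *pool, void *block)
{
    assert(block != NULL);
    if (pool->ncached < POOL_MAX_CACHED)
        pool->cached[pool->ncached++] = block;
    else
        free(block);
}

/* "One-off" allocation: map it directly so it bypasses the malloc heap. */
static void *
oneoff_alloc(size_t size)
{
    void   *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* munmap() returns the pages to the kernel immediately. */
static void
oneoff_free(void *p, size_t size)
{
    munmap(p, size);
}
```

The pool avoids handing pages back only to fault and zero them again on the
next query, while the mmap path keeps large one-off allocations from pinning
the top of the heap.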

Thank you for looking at this problem.

Regards,

--
Ronan Dunklau





