
From Andres Freund
Subject Re: Add GUC to tune glibc's malloc implementation.
Date
Msg-id 20230627181746.qz4373pbobkto3un@awork3.anarazel.de
In response to Re: Add GUC to tune glibc's malloc implementation.  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Responses Re: Add GUC to tune glibc's malloc implementation.
List pgsql-hackers
Hi,

On 2023-06-27 08:35:28 +0200, Ronan Dunklau wrote:
> Le lundi 26 juin 2023, 23:03:48 CEST Andres Freund a écrit :
> > Hi,
> >
> > On 2023-06-26 08:38:35 +0200, Ronan Dunklau wrote:
> > > I hope what I'm trying to achieve is clearer that way. Maybe this patch is
> > > not the best way to go about this, but since the memory allocator
> > > behaviour can have such an impact it's a bit sad we have to leave half
> > > the performance on the table because of it when there are easily
> > > accessible knobs to avoid it.
> > I'm *quite* doubtful this patch is the way to go.  If we want to more
> > tightly control memory allocation patterns, because we have more
> > information than glibc, we should do that, rather than try to nudge glibc's
> > malloc in a random direction.  In contrast to a generic malloc()
> > implementation, we can have much more information about memory lifetimes
> > etc. due to memory contexts.
>
> Yes, this is probably much more appropriate, but also a much larger change with
> greater risk of regression. Especially as we have to make sure we're not
> overfitting our own code to a specific malloc implementation, to the detriment
> of others.

I think your approach is fundamentally overfitting our code to a specific
malloc implementation, in a way that's not tunable by mere mortals. It just
seems like a dead end to me.


> Except if you hinted we should write our own directly instead ?

I don't think we should write our own malloc - we don't rely on it much
ourselves. And if we replaced it, we would need to care about malloc's
performance characteristics a whole lot, because various libraries etc. do
rely on it heavily.

However, I do think we should eventually avoid using malloc() for aset.c et
al. malloc() is a general-purpose allocator, but at least for allocations below
maxBlockSize, aset.c doesn't allocate in a way that benefits from that *at
all*. It's not a lot of work to do such allocations on our own.
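
Just to illustrate what I mean - a very rough sketch (not from any patch;
function names invented) of grabbing blocks straight from the kernel instead
of going through malloc():

#include <stddef.h>
#include <sys/mman.h>

/* Allocate one aset.c-style block directly from the kernel. */
static void *
block_alloc(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return (p == MAP_FAILED) ? NULL : p;
}

/* Give a block back to the kernel. */
static void
block_free(void *p, size_t size)
{
    munmap(p, size);
}

Obviously we wouldn't want a syscall per block at the small sizes - we'd want
to reserve larger regions and keep blocks around for reuse - but the point is
that malloc()'s generality isn't buying aset.c anything here.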


> > We e.g. could keep a larger number of memory blocks reserved
> > ourselves. Possibly by delaying the release of additionally held blocks
> > until we have been idle for a few seconds or such.
>
> I think keeping work_mem around after it has been used a couple of times makes
> sense. This is the memory a user is willing to dedicate to operations, after
> all.

The biggest overhead of returning pages to the kernel is that doing so triggers
zeroing of the data during the next allocation. Particularly on servers with
multiple NUMA nodes, that's surprisingly slow.  It's most commonly not the
brk() or mmap() calls themselves that are the performance issue.

Indeed, with your benchmark, I see that most of the time, on my dual Xeon Gold
5215 workstation, is spent zeroing newly allocated pages during page
faults. That microarchitecture is worse at this than some others, but it's
never free (or cache friendly).
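
To make the "keep a larger number of blocks reserved ourselves" idea a bit more
concrete, here's the kind of thing I have in mind - entirely hypothetical, all
names invented: freed blocks go into a small per-backend cache and are only
actually released once we've been idle for a few seconds, so the next
allocation reuses pages that are already faulted in instead of getting freshly
zeroed ones from the kernel.

#include <stddef.h>
#include <stdlib.h>
#include <time.h>

#define CACHE_SLOTS  8
#define KEEP_SECONDS 5

typedef struct
{
    void   *ptr;
    size_t  size;
} CachedBlock;

static CachedBlock block_cache[CACHE_SLOTS];
static time_t      last_activity;

/* aset.c blocks come in a handful of doubling sizes, so exact-size reuse is common */
static void *
cached_block_alloc(size_t size)
{
    last_activity = time(NULL);

    for (int i = 0; i < CACHE_SLOTS; i++)
    {
        if (block_cache[i].ptr != NULL && block_cache[i].size == size)
        {
            void *p = block_cache[i].ptr;

            block_cache[i].ptr = NULL;
            return p;           /* pages are still faulted in, no zeroing needed */
        }
    }
    return malloc(size);
}

static void
cached_block_free(void *ptr, size_t size)
{
    last_activity = time(NULL);

    for (int i = 0; i < CACHE_SLOTS; i++)
    {
        if (block_cache[i].ptr == NULL)
        {
            block_cache[i].ptr = ptr;
            block_cache[i].size = size;
            return;
        }
    }
    free(ptr);
}

/* called from some idle point, e.g. while waiting for the next command */
static void
cached_block_prune(void)
{
    if (time(NULL) - last_activity < KEEP_SECONDS)
        return;

    for (int i = 0; i < CACHE_SLOTS; i++)
    {
        if (block_cache[i].ptr != NULL)
        {
            free(block_cache[i].ptr);
            block_cache[i].ptr = NULL;
        }
    }
}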


> > WRT the difference in TPS in the benchmark you mention - I suspect that
> > we are doing something bad that needs to be improved regardless of the
> > underlying memory allocator implementation.  Due to the lack of detailed
> > instructions I couldn't reproduce the results immediately.
>
> I re-attached the simple script I used. I've run this script with different
> values for glibc_malloc_max_trim_threshold.

FWIW, in my experience trimming the brk()ed region doesn't work reliably
enough in real-world postgres workloads to be worth relying on (from a memory
usage POV). Sooner or later you're going to have longer-lived allocations
placed such that they prevent it from happening.
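
Rough standalone illustration of that (nothing to do with the patch, and
glibc-specific): one small, long-lived allocation near the top of the heap
keeps free() from shrinking the brk()ed region, no matter how much gets freed
underneath it.

#include <stdlib.h>
#include <malloc.h>

#define NBLOCKS 512
#define BLOCKSZ (64 * 1024)     /* below the default M_MMAP_THRESHOLD, so served via brk */

int
main(void)
{
    void *blocks[NBLOCKS];
    void *pin;

    for (int i = 0; i < NBLOCKS; i++)
        blocks[i] = malloc(BLOCKSZ);

    pin = malloc(64);           /* "long lived" allocation, ends up near the heap top */

    for (int i = 0; i < NBLOCKS; i++)
        free(blocks[i]);        /* ~32MB freed, but the heap can't shrink past 'pin' */

    malloc_stats();             /* "system bytes" stays high even though "in use bytes" is tiny */

    free(pin);
    return 0;
}

On the glibc versions I'm aware of, malloc_stats() here keeps reporting the
full heap as attached to the process even though almost nothing is in use
anymore.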

I have played around with telling aset.c that certain contexts are long lived
and using mmap() for those, to make it more likely that the libc malloc/free
can actually return memory to the system. I think that can be quite
worthwhile.

Greetings,

Andres Freund


