Re: Use generation context to speed up tuplesorts - Mailing list pgsql-hackers

From Ronan Dunklau
Subject Re: Use generation context to speed up tuplesorts
Date
Msg-id 7285172.GXAFRqVoOG@aivenronan
In response to Re: Use generation context to speed up tuplesorts  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Use generation context to speed up tuplesorts
Re: Use generation context to speed up tuplesorts
List pgsql-hackers
On Thursday, 16 December 2021, 18:00:56 CET, Tomas Vondra wrote:
> On 12/16/21 17:03, Ronan Dunklau wrote:
> > On Thursday, 16 December 2021, 11:56:15 CET, Ronan Dunklau wrote:
> >> I will follow up with a benchmark of the test sorting a table with a
> >> width varying from 1 to 32 columns.
> >
> > So please find attached another benchmark for that case.
> >
> > The 3 different patchsets tested are:
> >   - master
> >   - fixed (David's original patch)
> >   - adjust (Thomas growing blocks patch)
>
> Presumably Thomas is me, right?

I'm really sorry for this typo... Please accept my apologies.

>
> > So it looks like tuning malloc for this would be very beneficial for any
> > kind of allocation, and by doing so we reduce the problems seen with the
> > growing blocks patch to next to nothing, while keeping the ability to not
> > allocate too much memory from the get go.
>
> Thanks for running those tests and investigating the glibc behavior! I
> find those results very interesting. My conclusion from this is that
> the interaction between "our" allocator and the allocator in
> malloc (e.g. glibc) can be problematic. Which makes benchmarking and
> optimization somewhat tricky because code changes may trigger behavior
> change in glibc (or whatever allocator backs malloc).
>
> I think it's worth exploring if we can tune this in a reasonable way,
> but I have a couple concerns related to that:
>
> 1) I wonder how glibc-specific this is - I'd bet it applies to other
> allocators (either on another OS or just a different allocator on Linux)
> too. Tweaking glibc parameters won't affect those systems, of course,
> but maybe we should allow tweaking those systems too ...

I agree, finding their specific problems and seeing if we can work around them
would be interesting. I suppose glibc's malloc is the most commonly used
allocator in production, as it is the default for most Linux distributions.

>
> 2) In fact, I wonder if different glibc versions behave differently?
> Hopefully it's not changing that much, though. Ditto kernel versions,
> but the mmap/sbrk interface is likely more stable. We can test this.

That could be tested, yes. As a matter of fact, a commit removing the upper
limit for MALLOC_MMAP_THRESHOLD was committed to glibc just yesterday,
which means we can service much bigger allocations without mmap.


>
> 3) If we bump the thresholds, won't that work against reusing the
> memory? I mean, if we free a whole block (from any allocator we have),
> glibc might return it to kernel, depending on mmap threshold value. It's
> not guaranteed, but increasing the malloc thresholds will make that even
> less likely. So we might just as well increase the minimum block size,
> with about the same effect, no?

It is my understanding that malloc will try to compact memory by moving it
around. So the memory should actually be released to the kernel at some
point. In the meantime, malloc can reuse it for our next invocation (which can
be in a different memory context on our side).

If we increase the minimum block size, this is memory we will actually
reserve, and it will not protect us against the ramping-up behaviour:
 - the first allocation of a big block may be over mmap_threshold, and serviced
by an expensive mmap
 - when it's freed, the threshold is doubled
 - the next invocation is serviced by an sbrk call
 - freeing it will be above the trim threshold, and it will be returned to the
kernel.

After several "big" allocations, the thresholds will rise to their maximum
values (well, they used to, I need to check what happens with that latest
glibc patch...)

This will typically happen several times as malloc doubles the threshold each
time. This is probably the reason quadrupling the block sizes was more
effective.
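
To make the idea of tuning malloc concrete, here is a minimal sketch of pinning
the two thresholds discussed above with glibc's mallopt(). It is only an
illustration, not the patch I intend to write: the helper name
pin_malloc_thresholds, its return convention and the sanity check on the two
arguments are assumptions made up for this example. According to the
mallopt(3) man page, explicitly setting M_MMAP_THRESHOLD or M_TRIM_THRESHOLD
also disables glibc's dynamic adjustment of the mmap threshold, which is the
ramping-up behaviour described above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef __GLIBC__
#include <malloc.h>		/* mallopt(), M_MMAP_THRESHOLD, M_TRIM_THRESHOLD */
#endif

/*
 * Hypothetical helper (name and return convention are assumptions of this
 * sketch): pin glibc's mmap and trim thresholds to fixed values, in bytes.
 *
 * Returns 1 if both values were accepted, 0 if a value was rejected, and
 * -1 when not built against glibc, where these knobs are not available.
 */
static int
pin_malloc_thresholds(size_t mmap_threshold, size_t trim_threshold)
{
	/*
	 * Sanity check chosen for this sketch only: in its dynamic mode glibc
	 * keeps the trim threshold at twice the mmap threshold, so a smaller
	 * trim threshold is most likely a caller mistake.
	 */
	assert(trim_threshold >= mmap_threshold);

	/* mallopt() takes an int, so refuse values that would not fit. */
	if (mmap_threshold > (size_t) INT_MAX || trim_threshold > (size_t) INT_MAX)
		return 0;

#ifdef __GLIBC__
	/* mallopt() returns 1 on success and 0 on error. */
	if (mallopt(M_MMAP_THRESHOLD, (int) mmap_threshold) != 1)
		return 0;
	if (mallopt(M_TRIM_THRESHOLD, (int) trim_threshold) != 1)
		return 0;
	return 1;
#else
	return -1;
#endif
}
```

A caller would invoke something like pin_malloc_thresholds(256 * 1024,
512 * 1024) once, before the large allocations start. Which values make sense,
and whether they should follow work_mem or similar settings, is exactly what
would need benchmarking. Older glibc versions reject mmap thresholds above a
built-in maximum, hence the checked return values.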


>
> > I would like to try to implement some dynamic glibc malloc tuning, if that
> > is something we don't reject on principle from the get go.
>
> +1 to that

OK, I'll work on a patch for this and start a new thread for it.


--
Ronan Dunklau




