Re: Vacuum: allow usage of more than 1GB of work mem - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Vacuum: allow usage of more than 1GB of work mem
Date
Msg-id 8e5cbf08-5dd8-466d-9271-562fc65f133f@iki.fi
In response to Re: Vacuum: allow usage of more than 1GB of work mem  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers
On 16/07/18 18:35, Claudio Freire wrote:
> On Mon, Jul 16, 2018 at 11:34 AM Claudio Freire <klaussfreire@gmail.com> wrote:
>> On Fri, Jul 13, 2018 at 5:43 PM Andrew Dunstan
>> <andrew.dunstan@2ndquadrant.com> wrote:
>>> On 07/13/2018 09:44 AM, Heikki Linnakangas wrote:
>>>> Claudio raised a good point, that doing small pallocs leads to
>>>> fragmentation, and in particular, it might mean that we can't give
>>>> back the memory to the OS. The default glibc malloc() implementation
>>>> has a threshold of 4 or 32 MB or something like that - allocations
>>>> larger than the threshold are mmap()'d, and can always be returned to
>>>> the OS. I think a simple solution to that is to allocate larger
>>>> chunks, something like 32-64 MB at a time, and carve out the
>>>> allocations for the nodes from those chunks. That's pretty
>>>> straightforward, because we don't need to worry about freeing the
>>>> nodes in retail. Keep track of the current half-filled chunk, and
>>>> allocate a new one when it fills up.
>>>
>>> Google seems to suggest the default threshold is much lower, like 128K.
>>> Still, making larger allocations seems sensible. Are you going to work
>>> on that?
>>
>> Below a few MB the threshold is dynamic, and if a block bigger than
>> 128K but smaller than the higher threshold (32-64MB IIRC) is freed,
>> the dynamic threshold is set to the size of the freed block.
>>
>> See M_MMAP_MAX and M_MMAP_THRESHOLD in the man page for mallopt[1]
>>
>> So I'd suggest allocating blocks bigger than M_MMAP_MAX.
>>
>> [1] http://man7.org/linux/man-pages/man3/mallopt.3.html
> 
> Sorry, substitute M_MMAP_MAX with DEFAULT_MMAP_THRESHOLD_MAX, the
> former is something else.

Yeah, we basically want to be well above whatever the threshold is. I
don't think we should try to check for any specific constant, just make
it large enough. Different libc implementations might have different
policies, too. There's little harm in overshooting, e.g. making 64 MB
allocations when 1 MB would have been enough to trigger the mmap()
behavior. It's going to be more granular than the current situation
anyway, where we do a single massive allocation.

(A code comment to briefly mention the thresholds on common platforms 
would be good, though).
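
To make the idea concrete, here is a rough sketch of the chunked
allocation discussed above. It is not taken from any patch: the names
(CHUNK_SIZE, Chunk, ChunkAlloc, chunk_alloc, chunk_free_all) are
hypothetical, it uses plain malloc() rather than palloc(), and the 64 MB
figure is just an assumed "large enough" value. Nodes are carved out of
the current partially filled chunk, and a new chunk is allocated when it
fills up; nothing is freed in retail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Assumed chunk size, picked to be well above the mmap threshold of
 * common malloc implementations. For glibc, the mallopt(3) man page
 * gives a default M_MMAP_THRESHOLD of 128 kB, adjusted dynamically up to
 * DEFAULT_MMAP_THRESHOLD_MAX (512 kB on 32-bit systems,
 * 4 MB * sizeof(long) on 64-bit systems). Other libcs may differ.
 */
#define CHUNK_SIZE ((size_t) 64 * 1024 * 1024)

typedef struct Chunk
{
    struct Chunk *next;     /* previously filled chunks */
    size_t      used;       /* bytes handed out from data so far */
    char       *data;       /* CHUNK_SIZE bytes */
} Chunk;

typedef struct ChunkAlloc
{
    Chunk      *head;       /* current, partially filled chunk */
    size_t      nchunks;
} ChunkAlloc;

/* Carve 'size' bytes out of the current chunk, starting a new one if full. */
static void *
chunk_alloc(ChunkAlloc *a, size_t size)
{
    void       *p;

    size = (size + 7) & ~(size_t) 7;    /* keep allocations 8-byte aligned */
    assert(size <= CHUNK_SIZE);

    if (a->head == NULL || CHUNK_SIZE - a->head->used < size)
    {
        Chunk      *c = malloc(sizeof(Chunk));

        if (c == NULL)
            return NULL;
        c->data = malloc(CHUNK_SIZE);
        if (c->data == NULL)
        {
            free(c);
            return NULL;
        }
        c->used = 0;
        c->next = a->head;
        a->head = c;
        a->nchunks++;
    }

    p = a->head->data + a->head->used;
    a->head->used += size;
    return p;
}

/* Release every chunk at once; individual nodes are never freed. */
static void
chunk_free_all(ChunkAlloc *a)
{
    while (a->head != NULL)
    {
        Chunk      *next = a->head->next;

        free(a->head->data);
        free(a->head);
        a->head = next;
    }
    a->nchunks = 0;
}
```

A caller would start from a zeroed ChunkAlloc, call chunk_alloc() for
each node, and call chunk_free_all() once at the end. The tail of a chunk
that is too small for the next request is simply wasted, which seems
acceptable when nodes are tiny compared to the chunk size.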

- Heikki

