Re: copy.c allocation constant - Mailing list pgsql-hackers

From Andres Freund
Subject Re: copy.c allocation constant
Date
Msg-id 20180124195549.n5j4qxzqzf5p2g74@alap3.anarazel.de
In response to Re: copy.c allocation constant  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: copy.c allocation constant
List pgsql-hackers
On 2018-01-24 14:25:37 -0500, Robert Haas wrote:
> On Wed, Jan 24, 2018 at 1:43 PM, Andres Freund <andres@anarazel.de> wrote:
> > Indeed. Don't think RAW_BUF_SIZE is quite big enough for that on most
> > platforms though. From man mallopt:
> >   Balancing these factors leads to a default setting of 128*1024 for the M_MMAP_THRESHOLD parameter.
> > Additionally, even when malloc() chooses to use mmap() to back an
> > allocation, it'll still need a header to know the size of the
> > allocation and such. So exactly using a size of a multiple of 4KB will
> > still leave you with wasted space.  Due to the latter I can't see it
> > mattering whether or not we add +1 to a power-of-two size.
> 
> Well, it depends on how it works.  dsa_allocate, for example, never
> adds a header to the size of the allocation.

glibc's malloc does add a header. My half-informed suspicion is that
most newer allocators backing malloc will have a header, because a
shared lookup-by-address table is pretty expensive to maintain. A bit
of metadata indicating size and/or source of the allocation makes
using thread-local information a lot easier.


> Allocations < 8kB are
> bucketed by size class and stored in superblocks carved up into
> equal-sized chunks.  Allocations > 8kB are rounded to a multiple of
> the 4kB page size and we grab that many consecutive free pages.  I
> didn't make those behaviors up; I copied them from elsewhere.  Some
> other allocator I read about did small-medium-large allocations: large
> with mmap(), medium with multiples of the page size, small with
> closely-spaced size classes.
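
As an illustration of the scheme described in the quote above, here is a
minimal C sketch that routes a request by size: requests under 8kB count
as "small" (size-class bucketed), larger ones are rounded up to whole 4kB
pages with no header added. The names (sketch_is_small, sketch_pages_for,
SKETCH_PAGE_SIZE, SKETCH_SMALL_LIMIT) are hypothetical, and this is not
dsa.c's actual code. The quote does not say how a request of exactly 8kB
is handled, so treating it as large is an assumption.

```c
#include <stddef.h>

/* Hypothetical constants, taken from the figures in the quoted text. */
#define SKETCH_PAGE_SIZE   ((size_t) 4096)
#define SKETCH_SMALL_LIMIT ((size_t) 8192)

/* Returns 1 if the request would be served from a size-class bucket. */
static int
sketch_is_small(size_t request)
{
    return request < SKETCH_SMALL_LIMIT;
}

/*
 * Number of consecutive pages a large request occupies when no header
 * is added: the request is simply rounded up to a multiple of the page size.
 */
static size_t
sketch_pages_for(size_t request)
{
    return (request + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

In this header-less model a 64kB request takes exactly 16 pages, while
64kB+1 spills into a 17th.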

Sure - all I'm trying to say is that it likely won't matter whether we
use power-of-two or power-of-two + 1, because overhead considerations
mean we probably won't quite fit into a size class anyway.
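
To make that arithmetic concrete, a minimal sketch, assuming an allocator
that prepends a per-allocation header and rounds the total up to whole
pages. The 16-byte header and 4kB page size used in the note below, and
the function name sketch_mapped_bytes, are assumptions for illustration,
not values taken from glibc's source.

```c
#include <stddef.h>

/*
 * Bytes actually reserved for a request when the allocator prepends a
 * header and rounds the total up to whole pages. page_size must be non-zero.
 */
static size_t
sketch_mapped_bytes(size_t request, size_t header, size_t page_size)
{
    size_t total = request + header;

    return (total + page_size - 1) / page_size * page_size;
}
```

Under those assumptions, sketch_mapped_bytes(65536, 16, 4096) and
sketch_mapped_bytes(65537, 16, 4096) both come to 17 pages (69632 bytes),
so the +1 costs nothing extra. Only with a zero-byte header does the exact
power of two fit in 16 pages.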


> It doesn't seem like a particularly good idea to take a 64kB+1 byte
> allocation, stick a header on it, and pack it tightly up against other
> allocations on both sides.  Seems like that could lead to
> fragmentation problems.  Is that really what it does?

No, I'm fairly sure it's not.

Greetings,

Andres Freund

