Re: PATCH: two slab-like memory allocators - Mailing list pgsql-hackers
From | Petr Jelinek |
---|---|
Subject | Re: PATCH: two slab-like memory allocators |
Date | |
Msg-id | 885540a7-ac72-5a21-6dae-a4d5b50672da@2ndquadrant.com Whole thread Raw |
In response to | Re: PATCH: two slab-like memory allocators (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: PATCH: two slab-like memory allocators
|
List | pgsql-hackers |
On 25/09/16 22:17, Tomas Vondra wrote: > On 09/25/2016 08:48 PM, Petr Jelinek wrote: > >> Slab: >> In general it seems understandable, the initial description helps to >> understand what's happening well enough. >> >> One thing I don't understand however is why the freelist is both >> array and doubly linked list and why there is new implementation of >> said doubly linked list given that we have dlist. >> > > Hmm, perhaps that should be explained better. > > In AllocSet, we only have a global freelist of chunks, i.e. we have a > list of free chunks for each possible size (there's 11 sizes, starting > with 8 bytes and then doubling the size). So freelist[0] is a list of > free 8B chunks, freelist[1] is a list of free 16B chunks, etc. > > In Slab, the freelist has two levels - first there's a bitmap on each > block (which is possible, as the chunks have constant size), tracking > which chunks of that particular block are free. This makes it trivial to > check that all chunks on the block are free, and free the whole block > (which is impossible with AllocSet). > > Second, the freelist at the context level tracks blocks with a given > number of free chunks - so freelist[0] tracks completely full blocks, > freelist[1] is a list of blocks with 1 free chunk, etc. This is used to > reuse space on almost full blocks first, in the hope that some of the > less full blocks will get completely empty (and freed to the OS). > > Is that clear? Ah okay, makes sense, the documentation of this could be improved then though as it's all squashed into single sentence that wasn't quite clear for me. > >>> +/* >>> + * SlabChunk >>> + * The prefix of each piece of memory in an SlabBlock >>> + * >>> + * NB: this MUST match StandardChunkHeader as defined by >>> utils/memutils.h. >>> + */ >> >> Is this true? Why? And if it is then why doesn't the SlabChunk >> actually match the StandardChunkHeader? > > It is true, a lot of stuff in MemoryContext infrastructure relies on > that. For example when you do pfree(ptr), we actually do something like > > header = (StandardChunkHeader*)(ptr - sizeof(StandardChunkHeader)) > > to get the chunk header - which includes pointer to the memory context > and other useful stuff. > > This also means we can put additional fields before StandardChunkHeader > as that does not break this pointer arithmetic, i.e. SlabChunkData is > effectively defined like this: > > typedef struct SlabChunkData > { > /* block owning this chunk */ > void *block; > > /* standard header */ > StandardChunkHeader header; > } SlabChunkData; > Yes but your struct then does not match StandardChunkHeader exactly so it should be explained in more detail (the aset.c where this comment is also present has struct that matches StandardChunkHeader so it's sufficient there). >> >>> +static void >>> +SlabCheck(MemoryContext context) >>> +{ >>> + /* FIXME */ >>> +} >> >> Do you plan to implement this interface? >> > > Yes, although I'm not sure what checks should go there. The only thing I > can think of right now is checking that the number of free chunks on a > block (according to the bitmap) matches the freelist index. > Yeah this context does not seem like it needs too much checking. The freelist vs free chunks check sounds ok to me. I guess GenSlab then will call the checks for the underlying contexts. >>> +#define SLAB_DEFAULT_BLOCK_SIZE 8192 >>> +#define SLAB_LARGE_BLOCK_SIZE (8 * 1024 * 1024) >> >> I am guessing this is based on max_cached_tuplebufs? Maybe these >> could be written with same style? >> > > Not sure I understand what you mean by "based on"? I don't quite > remember how I came up with those constants, but I guess 8kB and 8MB > seemed like good values. > > Also, what style you mean? I've used the same style as for ALLOCSET_* > constants in the same file. > I mean using 8 * 1024 for SLAB_DEFAULT_BLOCK_SIZE so that it's more readable. ALLOCSET_* does that too (with exception of ALLOCSET_SEPARATE_THRESHOLD which I have no idea why it's different from rest of the code). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: