Re: slab allocator performance issues - Mailing list pgsql-hackers

From Andres Freund
Subject Re: slab allocator performance issues
Msg-id 20210717195307.7hsif32kkty4jnwq@alap3.anarazel.de
In response to slab allocator performance issues  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 2021-07-17 12:43:33 -0700, Andres Freund wrote:
> 2) SlabChunkIndex() in SlabFree() is slow. It requires a 64bit division, taking
> up ~50% of the cycles in SlabFree(). A 64bit div, according to [1], has a
> latency of 35-88 cycles on skylake-x (and a reciprocal throughput of 21-83,
> i.e. no parallelism). While it's getting a bit faster on icelake / zen 3, it's
> still slow enough there to be very worrisome.
> 
> I don't see a way to get around the division while keeping the freelist
> structure as is. But:
> 
> ISTM that we only need the index because of the free-chunk list, right? Why
> don't we make the chunk list use actual pointers? Is the concern that that'd
> increase the minimum allocation size?  If so, I see two ways around that:
> First, we could make the index just the offset from the start of the block,
> that's much cheaper to calculate. Second, we could store the next pointer in
> SlabChunk->slab/block instead (making it a union) - while on the freelist we
> don't need to dereference those, right?
> 
> I suspect both would also make the block initialization a bit cheaper.
> 
> That should also accelerate SlabBlockGetChunk(), which currently shows up as
> an imul, which isn't exactly fast either (and uses a lot of execution ports).
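
To make the alternatives from the quoted text concrete, here is a minimal
sketch. It is not the actual slab.c code: DemoBlock and all demo_* names
are hypothetical, and the layout is heavily simplified. It contrasts the
index computation (a division by a runtime value), the plain byte offset
(a subtraction only), and a free list that stores the next pointer in the
freed chunk's own memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified block layout; not the real SlabBlock. */
typedef struct DemoBlock
{
    char   *first_chunk;        /* start of the chunk array */
    size_t  full_chunk_size;    /* header + payload, in bytes */
    void   *freehead;           /* head of the pointer-linked free list */
} DemoBlock;

/* Index-based scheme: needs a division by a value known only at runtime. */
static inline size_t
demo_chunk_index(const DemoBlock *block, const void *chunk)
{
    size_t offset = (size_t) ((const char *) chunk - block->first_chunk);

    return offset / block->full_chunk_size;
}

/* Offset-based scheme: the same information, but only a subtraction. */
static inline size_t
demo_chunk_offset(const DemoBlock *block, const void *chunk)
{
    return (size_t) ((const char *) chunk - block->first_chunk);
}

/*
 * Pointer-based scheme: the freed chunk's memory holds the next pointer.
 * memcpy() is used so the sketch makes no alignment assumptions.
 */
static inline void
demo_free_push(DemoBlock *block, void *chunk)
{
    memcpy(chunk, &block->freehead, sizeof(void *));
    block->freehead = chunk;
}

static inline void *
demo_free_pop(DemoBlock *block)
{
    void *chunk = block->freehead;

    if (chunk != NULL)
        memcpy(&block->freehead, chunk, sizeof(void *));
    return chunk;
}
```

With the pointer-based variant neither freeing nor allocating needs the
chunk index at all, which is where the division and the imul come from.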

Oh - I just saw that effectively the allocation size already is a
uintptr_t at minimum. I had only seen

    /* Make sure the linked list node fits inside a freed chunk */
    if (chunkSize < sizeof(int))
        chunkSize = sizeof(int);

but it's followed by

    /* chunk, including SLAB header (both addresses nicely aligned) */
    fullChunkSize = sizeof(SlabChunk) + MAXALIGN(chunkSize);

which means we are already reserving enough space for a pointer on just
about any platform? Seems we can just make that official and reserve
space for a pointer as part of rounding up the chunk size, rather than
relying on the MAXALIGN() in the fullChunkSize computation?
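
A small sketch of that reasoning, under stated assumptions:
DEMO_MAXIMUM_ALIGNOF is hard-coded to 8 here (the real MAXIMUM_ALIGNOF is
determined at configure time), and the demo_* names are hypothetical. It
compares the current sizeof(int) floor followed by max-alignment with a
variant that uses sizeof(void *) as the floor directly.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: maximum alignment is 8 bytes. */
#define DEMO_MAXIMUM_ALIGNOF 8

/* Round len up to the next multiple of DEMO_MAXIMUM_ALIGNOF. */
#define DEMO_MAXALIGN(len) \
    (((size_t) (len) + (DEMO_MAXIMUM_ALIGNOF - 1)) & \
     ~((size_t) (DEMO_MAXIMUM_ALIGNOF - 1)))

/* The argument only holds if a pointer fits into one aligned unit. */
_Static_assert(sizeof(void *) <= DEMO_MAXIMUM_ALIGNOF,
               "pointer must fit into a max-aligned unit");

/* Current behaviour: floor at sizeof(int), then max-align. */
static inline size_t
demo_payload_size_int(size_t chunk_size)
{
    if (chunk_size < sizeof(int))
        chunk_size = sizeof(int);
    return DEMO_MAXALIGN(chunk_size);
}

/* "Official" variant: floor at sizeof(void *), then max-align. */
static inline size_t
demo_payload_size_ptr(size_t chunk_size)
{
    if (chunk_size < sizeof(void *))
        chunk_size = sizeof(void *);
    return DEMO_MAXALIGN(chunk_size);
}
```

Under these assumptions both variants yield the same payload size for
every request, so reserving room for a pointer explicitly would not make
any chunk larger than it already is.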

Greetings,

Andres Freund


