Re: slab allocator performance issues - Mailing list pgsql-hackers

From David Rowley
Subject Re: slab allocator performance issues
Date
Msg-id CAApHDvoxVxFN0DXYyn6tDdg6s7wx2sVrVJ_JSCZxrfd-s86j8Q@mail.gmail.com
In response to Re: slab allocator performance issues  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: slab allocator performance issues  (John Naylor <john.naylor@enterprisedb.com>)
List pgsql-hackers
On Sat, 11 Sept 2021 at 09:07, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> I've been investigating the regressions in some of the benchmark
> results, together with the generation context benchmarks [1].

I've not looked into the regression you found with this yet, but I did
rebase the patch.  slab.c has seen quite a number of changes recently.

I didn't spend a lot of time checking over the patch. I mainly wanted
to see what the performance was like before reviewing in too much
detail.

To test the performance, I used [1] and ran:

select pg_allocate_memory_test(<nbytes>, 1024*1024,
10::bigint*1024*1024*1024, 'slab');

That allocates chunks of <nbytes> each, keeps around 1MB of them
allocated at any one time, and allocates 10GB of them in total.
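To give a feel for the amount of work each run does, here is a rough
sketch of the arithmetic. It assumes the second and third arguments
are plain byte counts (1MB kept live, 10GB in total) that are simply
divided by the chunk size; the demo_* names are made up for
illustration and are not part of the test function in [1].

```c
#include <assert.h>
#include <stdint.h>

/* Assumed meaning of the two size arguments used in the query above. */
#define DEMO_KEEP_BYTES		(UINT64_C(1024) * 1024)
#define DEMO_TOTAL_BYTES	(UINT64_C(10) * 1024 * 1024 * 1024)

/* Approximate number of chunks held allocated at any one time. */
static uint64_t
demo_live_chunks(uint64_t nbytes)
{
	assert(nbytes > 0);
	return DEMO_KEEP_BYTES / nbytes;
}

/* Approximate number of allocations performed over the whole run. */
static uint64_t
demo_total_allocs(uint64_t nbytes)
{
	assert(nbytes > 0);
	return DEMO_TOTAL_BYTES / nbytes;
}
```

Under those assumptions, halving the chunk size doubles the number of
allocations needed to reach the same total.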

I saw:

Master:
16 byte chunk = 8754.678 ms
32 byte chunk = 4511.725 ms
64 byte chunk = 2244.885 ms
128 byte chunk = 1135.349 ms
256 byte chunk = 548.030 ms
512 byte chunk = 272.017 ms
1024 byte chunk = 144.618 ms

Master + attached patch:
16 byte chunk = 5255.974 ms
32 byte chunk = 2640.807 ms
64 byte chunk = 1328.949 ms
128 byte chunk = 668.078 ms
256 byte chunk = 330.564 ms
512 byte chunk = 166.844 ms
1024 byte chunk = 85.399 ms

So at every chunk size, patched runs in about 60% of the time that
master takes.
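For anyone wanting to check that figure, a small sketch that divides
the patched timing by the master timing for each chunk size. The
numbers are copied from the two lists above; the demo_* names are
only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NSIZES 7

/* Timings in ms for 16, 32, 64, 128, 256, 512 and 1024 byte chunks. */
static const double demo_master_ms[DEMO_NSIZES] = {
	8754.678, 4511.725, 2244.885, 1135.349, 548.030, 272.017, 144.618
};
static const double demo_patched_ms[DEMO_NSIZES] = {
	5255.974, 2640.807, 1328.949, 668.078, 330.564, 166.844, 85.399
};

/* Fraction of master's runtime that the patched build needs. */
static double
demo_time_ratio(size_t i)
{
	assert(i < DEMO_NSIZES);
	return demo_patched_ms[i] / demo_master_ms[i];
}
```

Each of the seven ratios comes out between roughly 0.58 and 0.62.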

I plan to look at the patch in a bit more detail and see if I can
recreate and figure out the regression that Tomas reported. For now, I
just want to share the rebased patch.

The only thing I really adjusted from Andres' version is the per-block
freelist. Instead of linking the free chunks together with pointers,
I made it store the number of bytes into the block at which the chunk
starts.  This means we can use 4 bytes instead of 8 bytes for these
links.  The block size is limited to 1GB now anyway, so 32 bits is
large enough for these offsets.
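
To illustrate the idea, here is a minimal standalone sketch of a
freelist that links free chunks by 32-bit byte offsets from the start
of the block rather than by pointers. It is not the code from the
patch: DemoBlock, the demo_* functions and DEMO_INVALID_OFFSET are
hypothetical names, and it uses malloc() instead of the memory context
machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical end-of-list marker; never a valid offset here. */
#define DEMO_INVALID_OFFSET UINT32_MAX

typedef struct DemoBlock
{
	uint32_t	freehead;	/* offset of first free chunk from block start */
	uint32_t	chunk_size;	/* bytes per chunk, at least sizeof(uint32_t) */
	uint32_t	nfree;		/* number of chunks currently on the freelist */
	char		data[];		/* chunk storage follows the header */
} DemoBlock;

static DemoBlock *
demo_block_create(uint32_t chunk_size, uint32_t nchunks)
{
	uint64_t	total = offsetof(DemoBlock, data) +
		(uint64_t) chunk_size * nchunks;
	DemoBlock  *block;
	uint32_t	i;

	/* Every offset in the block must fit in 32 bits. */
	if (chunk_size < sizeof(uint32_t) || nchunks == 0 ||
		total >= DEMO_INVALID_OFFSET)
		return NULL;

	block = malloc((size_t) total);
	if (block == NULL)
		return NULL;

	block->chunk_size = chunk_size;
	block->nfree = nchunks;
	block->freehead = (uint32_t) offsetof(DemoBlock, data);

	/* Each free chunk stores the offset of the next free chunk. */
	for (i = 0; i < nchunks; i++)
	{
		uint32_t	self = block->freehead + i * chunk_size;
		uint32_t	next = (i + 1 < nchunks) ?
			self + chunk_size : DEMO_INVALID_OFFSET;

		memcpy((char *) block + self, &next, sizeof(next));
	}
	return block;
}

static void *
demo_chunk_alloc(DemoBlock *block)
{
	char	   *chunk;

	if (block->freehead == DEMO_INVALID_OFFSET)
		return NULL;			/* block is full */

	/* Pop the head; its first 4 bytes hold the next free offset. */
	chunk = (char *) block + block->freehead;
	memcpy(&block->freehead, chunk, sizeof(uint32_t));
	block->nfree--;
	return chunk;
}

static void
demo_chunk_free(DemoBlock *block, void *ptr)
{
	ptrdiff_t	offset = (char *) ptr - (char *) block;

	assert(offset >= (ptrdiff_t) offsetof(DemoBlock, data));
	assert((uint64_t) offset < DEMO_INVALID_OFFSET);

	/* Push onto the head: store the old head offset in the chunk. */
	memcpy(ptr, &block->freehead, sizeof(uint32_t));
	block->freehead = (uint32_t) offset;
	block->nfree++;
}
```

In this sketch a free chunk only needs room for a 4-byte link, where a
pointer-based list would need 8 bytes on a 64-bit build.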

David

[1] https://www.postgresql.org/message-id/attachment/137056/allocate_performance_functions.patch.txt
