Re: slab allocator performance issues - Mailing list pgsql-hackers
From: David Rowley
Subject: Re: slab allocator performance issues
Date:
Msg-id: CAApHDvq6eUdLJxAUdSmukGiiTQNT79cNtntL=3FE52T_AP3XDQ@mail.gmail.com
In response to: Re: slab allocator performance issues (John Naylor <john.naylor@enterprisedb.com>)
Responses: Re: slab allocator performance issues
List: pgsql-hackers

I've spent quite a bit more time on the slab changes now and have attached a v3 patch. One of the major things I've spent that time on is benchmarking. I'm aware that Tomas wrote some functions to benchmark this; I've taken those and made some modifications to allow the memory context type to be specified as a function parameter. This allows me to easily compare the performance of slab with both aset and generation.

Another change that I made to Tomas' module was how the random ordering part works. What I wanted was the ability to specify how randomly to pfree the chunks, so I could test various degrees of randomness and see how that affects performance. What I ended up with was the ability to specify the number of "random segments". This controls how many groups we split all the allocated chunks into for randomisation. With 1 random segment, we're just randomising over all chunks. With 10 random segments, we split the array of allocated chunks into 10 portions based on either FIFO or LIFO order, then randomise the order of the chunks only within each of those segments. This allows us to test FIFO/LIFO allocation patterns with and without randomness, and any degree in between. If the number of random segments is set to 0, no randomisation is done.

Another change I made to Tomas' code: I'm now using palloc0() instead of palloc(), and I'm checking that the first byte of the allocated chunk is '\0' before pfreeing it. What I was finding was that pfree showed up as highly dominant in perf output due to it having to dereference the MemoryChunk to find the context-type bits. pfree had to do this as none of the calling code had touched any of the memory in the chunk, and I felt it was unrealistic to palloc memory and then pfree it without ever having done anything with it. Mostly this just moves around which function is penalised by having to load the cache line. I mainly did this because I was struggling to make any sense of perf's output otherwise.
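To make that concrete, the benchmark's inner loop effectively does something like the sketch below. This is only an illustration of the scheme described above, not code from the attached patch; bench_context() and its structure are made up for the example.

#include "postgres.h"

#include "utils/memutils.h"

/*
 * Illustrative sketch only -- allocate nchunks chunks of chunk_size in
 * cxt, optionally shuffle them in nsegments groups, then pfree them all.
 */
static void
bench_context(MemoryContext cxt, int nchunks, Size chunk_size,
              int nsegments, bool lifo)
{
    void      **chunks = palloc(sizeof(void *) * nchunks);

    /* palloc0 so that the chunk's memory is touched at allocation time */
    for (int i = 0; i < nchunks; i++)
        chunks[i] = MemoryContextAllocZero(cxt, chunk_size);

    /* for LIFO, pfree in the reverse of the allocation order */
    if (lifo)
    {
        for (int i = 0; i < nchunks / 2; i++)
        {
            void       *tmp = chunks[i];

            chunks[i] = chunks[nchunks - 1 - i];
            chunks[nchunks - 1 - i] = tmp;
        }
    }

    /*
     * nsegments == 0 means no randomisation.  Otherwise carve the array
     * into nsegments portions and Fisher-Yates shuffle within each
     * portion only, so 1 segment randomises everything and higher values
     * get progressively closer to a pure FIFO/LIFO pattern.  (Remainder
     * chunks are left unshuffled for brevity.)
     */
    for (int s = 0; s < nsegments; s++)
    {
        int         seglen = nchunks / nsegments;
        void      **seg = chunks + s * seglen;

        for (int i = seglen - 1; i > 0; i--)
        {
            int         j = random() % (i + 1);
            void       *tmp = seg[i];

            seg[i] = seg[j];
            seg[j] = tmp;
        }
    }

    for (int i = 0; i < nchunks; i++)
    {
        /*
         * Read the first byte before freeing so that the cache line is
         * loaded here rather than inside pfree(), which must dereference
         * the MemoryChunk header to find the context-type bits.
         */
        if (*(char *) chunks[i] != '\0')
            elog(ERROR, "chunk memory was clobbered");

        pfree(chunks[i]);
    }

    pfree(chunks);
}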
I've attached alloc_bench_contrib.patch, which I used for testing. I've also attached a spreadsheet with the benchmark results.

The general summary from those is that slab is now generally on par with aset in terms of palloc performance. Previously slab was performing at about half the speed of aset, unless CPU cache pressure became more significant, in which case performance was dominated by fetching cache lines from RAM. However, the new code still makes meaningful improvements even under heavy CPU cache pressure. When it comes to pfree performance, the updated slab code is much faster than it was previously, but not quite on par with aset or generation.

The attached spreadsheet is broken down into 3 tabs. Each tab tests one chunk size with a fixed total number of chunks allocated at once. Within each tab, I test FIFO and then LIFO allocation patterns, each with different degrees of randomness introduced, as described above. In none of the tests was the patched version slower than the unpatched version.

One pending question I had was about SlabStats, where we list the free chunks. Since we now have a list of emptyblocks, I wasn't too sure if the chunks from those blocks should be included in that total. I'm currently not including them, but I have added some additional information listing the number of completely empty blocks in the emptyblocks list.

Some follow-up work that I think would be a good idea:

1. Reduce the SlabContext's chunkSize, fullChunkSize and blockSize fields from Size down to uint32. These have no need to be 64 bits, as we haven't allowed slab blocks over 1GB since c6e0fe1f2. I thought of doing this separately as we might want to rationalise the equivalent fields in aset.c and generation.c too. Those contexts can have external chunks, so I'm not 100% sure we should do the same there yet; I just didn't want to touch those files in this effort.

2. Slab should probably gain the ability to grow its block size, as both aset and generation do. Since the performance of the slab context is good now, we might want to use it for hash join's 32kB chunks, but I doubt we can do that without block size growth.

I'm planning on pushing the attached v3 patch shortly. I've spent several days reading it over and testing it in detail, along with adding additional features to the SlabCheck code to find more inconsistencies.

David