Re: slab allocator performance issues - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: slab allocator performance issues
Date:
Msg-id: a5ccda91-d9fc-49c5-b3c7-c81528b938c5@enterprisedb.com
In response to: Re: slab allocator performance issues (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List: pgsql-hackers
Hi,

I've been investigating the regressions in some of the benchmark results, together with the generation context benchmarks [1].

Turns out it's pretty difficult to benchmark this, because the results strongly depend on what the backend did before. For example, if I run slab_bench_fifo with the "decreasing" test for 32kB blocks and 512B chunks, I get this:

  select * from slab_bench_fifo(1000000, 32768, 512, 100, 10000, 5000);

   mem_allocated | alloc_ms | free_ms
  ---------------+----------+---------
       528547840 |   155394 |   87440

i.e. palloc() takes ~155ms and pfree() ~87ms (and these results are stable, the numbers don't change much with more runs).

But if I run a set of "lifo" tests in the backend first, the results look like this:

   mem_allocated | alloc_ms | free_ms
  ---------------+----------+---------
       528547840 |    41728 |   71524
  (1 row)

so the pallocs are suddenly ~4x faster. Clearly, what the backend did before may have a pretty dramatic impact on results, even for simple benchmarks like this.

Note: The benchmark was a single SQL script, running all the different workloads in the same backend.

I did a fair amount of perf profiling, and the main difference between the slow and fast runs seems to be this:

               0      page-faults:u
               0      minor-faults:u
               0      major-faults:u

vs

      20,634,153      page-faults:u
      20,634,153      minor-faults:u
               0      major-faults:u

Attached is a more complete perf stat output, but the page faults seem to be the main issue.

My theory is that in the "fast" case, the past backend activity puts the glibc memory management into a state that prevents page faults in the benchmark. But of course, this theory may be incomplete - for example, it's not clear why running the benchmark repeatedly would not "condition" the backend the same way. But it doesn't - it's ~150ms even for repeated runs.

Secondly, I'm not sure this explains why some of the timings actually got much slower with the 0003 patch, when the sequence of the steps is still the same. Of course, it's possible 0003 changes the allocation pattern a bit, interfering with glibc memory management.

This leads to a couple of interesting questions, I think:

1) I've only tested this on Linux, with glibc. I wonder how it'd behave on other platforms, or with other allocators.

2) Which cases are more important? When the backend was warmed up, or when each benchmark runs in a new backend? It seems the "new backend" is something like a "worst case" leading to more page faults, so maybe that's the thing to watch. OTOH it's unlikely to have a completely new backend, so maybe not.

3) Can this teach us something about how to allocate stuff, to better "prepare" the backend for future allocations? For example, it's a bit strange that repeated runs of the same benchmark don't do the trick, for some reason.

regards

[1] https://www.postgresql.org/message-id/bcdd4e3e-c12d-cd2b-7ead-a91ad416100a%40enterprisedb.com

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
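[For reference, a minimal sketch of the kind of single-backend SQL script described above, which should reproduce the cold/warm difference. The slab_bench_lifo() name and its argument list are assumptions modelled on the slab_bench_fifo() call shown earlier; the actual benchmark functions come from the patches discussed in this thread.

  -- cold backend: the "decreasing" fifo test (expect the slower, ~155ms palloc timings)
  select * from slab_bench_fifo(1000000, 32768, 512, 100, 10000, 5000);

  -- warm the backend up with a different allocation pattern
  -- (slab_bench_lifo and its arguments are an assumption, mirroring the fifo call)
  select * from slab_bench_lifo(1000000, 32768, 512, 100, 10000, 5000);

  -- the same fifo test again in the warmed-up backend (expect the faster, ~42ms timings)
  select * from slab_bench_fifo(1000000, 32768, 512, 100, 10000, 5000);

Running the first and last statements under perf stat, counting page-faults as in the numbers above, should show the same difference in fault counts.]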