Re: WIP: bufmgr rewrite per recent discussions - Mailing list pgsql-patches
From:           Mark Cave-Ayland
Subject:        Re: WIP: bufmgr rewrite per recent discussions
Date:
Msg-id:         9EB50F1A91413F4FA63019487FCD251D113118@WEBBASEDDC.webbasedltd.local
In response to: Re: WIP: bufmgr rewrite per recent discussions (Tom Lane <tgl@sss.pgh.pa.us>)
Responses:      Re: WIP: bufmgr rewrite per recent discussions
List:           pgsql-patches
Hi Tom,

Here are the results (tps) from your second patch, using the same test as
before, i.e. pgbench -s 10 -c 10 -t 1000 -d pgbench. In each pair of
columns, the left-hand figure includes connection establishment and the
right-hand figure excludes it.

shared_buffers       1000                        10000                       100000

             204.909702   205.01051      345.098727   345.411606     375.812059   376.37741
             195.100496   195.197463     348.791481   349.111363     314.718619   315.139878
             199.637965   199.735195     313.561366   313.803225     365.061177   365.666103
             195.935529   196.029082     325.893744   326.171754     370.040623   370.625072
             196.661374   196.756481     314.468751   314.711517     319.643145   320.099164

Mean:        198.4490132  198.5457462    329.5628138  329.841893     349.0551246  349.5815254

Having a percentage of shared_buffers scanned in each round means that no
extra tweaking is required for higher values of shared_buffers with the
default settings :)

> Another thing that might be interesting on a multi-CPU
> Opteron is to try to make the shared memory layout more
> friendly to the CPU cache, which I believe uses 128-byte
> cache lines. (Simon was planning to try some of these things
> but I haven't heard back about results.) Things to try here include
>
> 1. Change ALIGNOF_BUFFER in src/include/pg_config_manual.h to
> 128. This will require a full recompile I think. 2 and 3
> don't make any sense until after you do this.

OK.

> 2. Pad the BufferDesc struct (in src/include/storage/buf_internals.h)
> out to be exactly 64 or 128 bytes. (64 would make it exactly
> 2 buffer headers per cache line, so two CPUs would contend
> only when working on a pair of adjacent headers. 128 would
> mean no cross-header cache contention but of course it wastes
> a lot more storage.) You need only recompile the files in
> src/backend/storage/buffer/ after changing buf_internals.h.

Here are the results with the padded BufferDesc structure. First, the
padding to 64 bytes:

shared_buffers       1000                        10000                       100000

             206.862511   206.965854     302.316799   302.581089     317.357151   317.791769
             198.881107   198.974454     352.982754   353.319523     368.020383   368.625353
             200.66022    200.756237     319.80475    320.076327     369.440584   370.032709
             202.076089   202.17038      304.278037   304.520488     309.897702   310.332232
             204.511959   204.612334     314.043021   314.29964      318.424781   318.871094

Mean:        202.5983772  202.6958518    318.6850722  318.9594134    336.6281202  337.1306314

And here are the results padding BufferDesc to 128 bytes:

shared_buffers       1000                        10000                       100000

             204.071342   204.177755     368.942576   369.298066     373.385305   374.040511
             203.616738   203.717336     365.15145    365.508939     366.837804   367.487877
             206.353662   206.451992     303.231566   303.491979     312.613215   313.086744
             194.403251   194.497714     311.006837   311.250281     309.072588   309.536229
             192.950395   193.040478     334.19558    334.476809     316.284982   316.776723

Mean:        200.2790776  200.377055     336.5056018  336.8052148    335.6387788  336.1856168

As I see it, there is not much of a noticeable performance gain (and maybe
even a small loss) with the padding included. I suspect that with the
database sitting on software RAID 1, better drives would be needed before
this could be benchmarked meaningfully.

> 3. Pad the LWLock struct (in
> src/backend/storage/lmgr/lwlock.c) to some power of 2 up to
> 128 bytes --- same issue of space wasted versus cross-lock contention.

Having seen the results above, is it still worth looking at this?

Kind regards,

Mark.

------------------------
WebBased Ltd
South West Technology Centre
Tamar Science Park
Plymouth
PL6 8BT

T: +44 (0)1752 791021
F: +44 (0)1752 791023
W: http://www.webbased.co.uk
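P.S. In case anyone else wants to reproduce these runs, here is roughly
what the three changes look like in the source. These are my own sketches
rather than the exact diffs I built with; any name or value not quoted by
Tom above is a placeholder. Point 1 is just the one define (the 128 comes
from the Opteron cache-line size mentioned in the quoted text):

    /* src/include/pg_config_manual.h -- per point 1: align each disk
     * buffer on a 128-byte boundary so buffers line up with Opteron
     * cache lines.  Requires a full recompile. */
    #define ALIGNOF_BUFFER  128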
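Point 2 amounts to growing each buffer header to exactly 64 or 128 bytes.
A stand-alone sketch of the technique (the fields here are placeholders,
not the real BufferDesc members from buf_internals.h), with a compile-time
check that the padding arithmetic works out:

    #define BUFFERDESC_SIZE 128         /* use 64 for two headers per line */

    typedef struct SbufdescSketch
    {
        int  tag[4];                    /* placeholder for the real fields */
        int  refcount;                  /* placeholder */
        /* pad the struct out to exactly BUFFERDESC_SIZE bytes */
        char pad[BUFFERDESC_SIZE - 5 * sizeof(int)];
    } SbufdescSketch;

    /* fails to compile if the padding arithmetic above is wrong */
    typedef char sbufdesc_size_check[sizeof(SbufdescSketch) == BUFFERDESC_SIZE ? 1 : -1];

As Tom says, only the files under src/backend/storage/buffer/ need
recompiling after a change like this.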
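Point 3 is the same trick applied per lock; a union pads each lock out to
a power-of-2 stride without disturbing the member layout (again, the
members shown are stand-ins for the real LWLock fields in lwlock.c):

    #define LWLOCK_STRIDE 32            /* any power of 2 up to 128 */

    typedef struct LWLockSketch
    {
        volatile int mutex;             /* stand-ins for the real fields */
        int          shared_count;
        char         exclusive_count;
    } LWLockSketch;

    typedef union LWLockPaddedSketch
    {
        LWLockSketch lock;
        char         pad[LWLOCK_STRIDE];
    } LWLockPaddedSketch;

    /* compile-time check that the stride is what we asked for */
    typedef char lwlock_stride_check[sizeof(LWLockPaddedSketch) == LWLOCK_STRIDE ? 1 : -1];

The shared lock array would then be allocated as an array of the padded
union, so consecutive locks fall in separate cache-line-sized slots.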