Re: BufferAlloc: don't take two simultaneous locks - Mailing list pgsql-hackers
From | Yura Sokolov |
---|---|
Subject | Re: BufferAlloc: don't take two simultaneous locks |
Date | |
Msg-id | c6f11bacdab7f7ef0891dec0823759807038a91f.camel@postgrespro.ru |
In response to | Re: BufferAlloc: don't take two simultaneous locks (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses | Re: BufferAlloc: don't take two simultaneous locks |
List | pgsql-hackers |
On Thu, 17/03/2022 at 12:02 +0900, Kyotaro Horiguchi wrote:
> At Wed, 16 Mar 2022 14:11:58 +0300, Yura Sokolov <y.sokolov@postgrespro.ru> wrote in
> > On Wed, 16/03/2022 at 12:07 +0900, Kyotaro Horiguchi wrote:
> > > At Tue, 15 Mar 2022 13:47:17 +0300, Yura Sokolov <y.sokolov@postgrespro.ru> wrote in
> > > In v7, HASH_ENTER returns the element stored in DynaHashReuse using
> > > the freelist_idx of the new key. v8 uses that of the old key (at the
> > > time of HASH_REUSE). So in the "REUSE->ENTER(elem exists and
> > > returns the stashed)" case the stashed element is returned to its
> > > original partition. But it is not what I mentioned.
> > >
> > > On the other hand, once the stashed element is reused by HASH_ENTER,
> > > it gives the same resulting state as the HASH_REMOVE->HASH_ENTER(borrow
> > > from old partition) case. I suspect that the frequent freelist
> > > starvation comes from the latter case.
> >
> > Doubtfully. Due to probability theory, a single partition doubtfully
> > will be too overflowed. Therefore, freelist.
>
> Yeah. I think so generally.
>
> > But! With 128kb shared buffers there are just 32 buffers. 32 entries for
> > 32 freelist partitions - some freelist partition will certainly
> > have 0 entries even if all entries are in freelists.
>
> Anyway, it's an extreme condition and the starvation happens only at a
> negligible ratio.
>
> > > RETURNED: 2
> > > ALLOCED: 0
> > > BORROWED: 435
> > > REUSED: 495444
> > > ASSIGNED: 495467 (-23)
> > >
> > > Now "BORROWED" happens 0.8% of REUSED
> >
> > 0.08% actually :)
>
> Mmm. Doesn't matter:p
>
> > > > > > I lost access to Xeon 8354H, so returned to old Xeon X5675.
> > > > > ...
> > > > > > Strange thing: both master and patched version have higher
> > > > > > peak tps at X5675 at medium connections (17 or 27 clients)
> > > > > > than in the first October version [1]. But lower tps at higher
> > > > > > connection numbers (>= 191 clients).
> > > > > > I'll try to bisect on master this unfortunate change.
> ...
> > I've checked. Looks like something had changed on the server, since
> > the old master commit now behaves the same as the new one (and differently
> > from how it behaved in October).
> > I remember maintenance downtime of the server in November/December.
> > Probably the kernel was upgraded or some system settings were changed.
>
> One thing I have a little concern about is that the numbers show 1-2% of
> degradation steadily for connection numbers < 17.
>
> I think there are two possible causes of the degradation.
>
> 1. Additional branch by consolidating HASH_ASSIGN into HASH_ENTER.
>    This might cause degradation for memory-contended use.
>
> 2. nallocs operation might cause degradation on non-shared dynahashes?
>    I believe it doesn't, but I'm not sure.
>
>    On a simple benchmarking with pgbench on a laptop, dynahash
>    allocation (including shared and non-shared) happened at about 50
>    times per second with 10 processes and 200 with 100 processes.
>
> > > I don't think nalloced needs to be the same width as long. For the
> > > platforms with 32-bit long, anyway the possible degradation, if any, by
> > > 64-bit atomic there doesn't matter. So don't we always define the
> > > atomic as 64bit and use the pg_atomic_* functions directly?
> >
> > Some 32bit platforms have no native 64bit atomics. Then they are
> > emulated with locks.
> >
> > Well, and for a 32bit platform long is just enough. Why spend another
> > 4 bytes per each dynahash?
>
> I don't think the additional bytes matter, but emulated atomic
> operations can matter. However I'm not sure which platform uses that
> fallback implementation. (x86 seems to have __sync_fetch_and_add()
> since P4).
>
> My opinion in the previous mail is that if that level of degradation
> caused by emulated atomic operations matters, we shouldn't use atomics
> there at all, since atomic operations on modern platforms are not
> free either.
>
> In relation to 2 above, if we observe that the degradation disappears
> by (tentatively) using non-atomic operations for nalloced, we should go
> back to the previous per-freelist nalloced.

Here is a version with nalloced being a union of the appropriate atomic and long.

------

regards
Yura Sokolov
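For readers without the attachment, the idea of a counter that is "a union of the appropriate atomic and long" can be sketched roughly as follows. This is not the code from the patch: it uses C11 `<stdatomic.h>` as a stand-in for PostgreSQL's `pg_atomic_*` API, and the names `AllocCounter`, `DemoHash`, and the `demo_hash_*` functions are hypothetical. The assumption is that a shared hash table updates the counter atomically, while a backend-local one uses a plain `long` and skips the atomic operation entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical allocation counter (not the patch's definition).
 * Exactly one member is used for a given table, chosen at creation time:
 * the atomic member for shared tables, the plain long for local ones.
 */
typedef union
{
    atomic_long shared;     /* several processes may update it */
    long        local;      /* only one process ever touches it */
} AllocCounter;

typedef struct
{
    bool         is_shared; /* selects which union member is live */
    AllocCounter nalloced;
} DemoHash;

static void
demo_hash_init(DemoHash *h, bool is_shared)
{
    h->is_shared = is_shared;
    if (is_shared)
        atomic_init(&h->nalloced.shared, 0);
    else
        h->nalloced.local = 0;
}

/* Add n to the counter and return the value it held before the addition. */
static long
demo_hash_add_nalloced(DemoHash *h, long n)
{
    assert(n >= 0);
    if (h->is_shared)
        return atomic_fetch_add(&h->nalloced.shared, n);
    else
    {
        long old = h->nalloced.local;

        h->nalloced.local += n;
        return old;
    }
}

/* Read the current counter value through whichever member is live. */
static long
demo_hash_get_nalloced(DemoHash *h)
{
    return h->is_shared ? atomic_load(&h->nalloced.shared)
                        : h->nalloced.local;
}
```

Because `atomic_long` has the width of `long`, this layout costs no extra bytes on a 32-bit platform and never needs a 64-bit atomic there, which is the trade-off discussed in the quoted exchange. The price is one extra branch on `is_shared` per update.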
Attachment