Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock - Mailing list pgsql-hackers

From Andrey M. Borodin
Subject Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Date
Msg-id 4D5FCD99-1BAC-4353-942F-B022457A3D80@yandex-team.ru
In response to Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
List pgsql-hackers

> On 6 Nov 2023, at 09:09, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
>
>> Having hashtable to find SLRU page in the buffer IMV is too slow. Some comments on this approach can be found here [0].
>> I'm OK with having HTAB for that if we are sure performance does not degrade significantly, but I really doubt this is the case.
>> I even think SLRU buffers used HTAB in some ancient times, but I could not find commit when it was changed to linear search.
>
> The main intention of having this buffer mapping hash is to find the
> SLRU page faster than a sequential search when banks are relatively
> big, but if we find cases where the hash creates more overhead than
> it provides gain then I am fine to remove the hash, because the whole
> purpose of adding the hash here is to make the lookup faster.  So far
> in my tests I did not find any slowness.  Do you or anyone else have
> a test case, based on the previous research, that shows whether it
> creates any slowness?
PFA the test benchmark_slru_page_readonly(). In this test we run SimpleLruReadPage_ReadOnly(), an essential part of TransactionIdGetStatus().
Before introducing the HTAB for buffer mapping I get
Time: 14837.851 ms (00:14.838)
and with the buffer HTAB I get
Time: 22723.243 ms (00:22.723)

This hash table makes getting transaction status ~50% slower.

Benchmark script I used:
make -C $HOME/postgresMX -j 8 install && (pkill -9 postgres; rm -rf test; ./initdb test && \
  echo "shared_preload_libraries = 'test_slru'" >> test/postgresql.conf && \
  ./pg_ctl -D test start && \
  ./psql -c 'create extension test_slru' postgres && \
  ./pg_ctl -D test restart && \
  ./psql -c "SELECT count(test_slru_page_write(a, 'Test SLRU')) FROM generate_series(12346, 12393, 1) as a;" \
         -c '\timing' -c "SELECT benchmark_slru_page_readonly(12377);" postgres)
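
To make the comparison concrete, here is a simplified, self-contained sketch of the two lookup strategies (illustrative only: the struct names, bank size, and hash scheme are made up for this example and are not the actual slru.c or patch code). The point is that the linear variant is a few predictable comparisons over a tiny array, while the hash variant adds hashing and extra memory indirections on every lookup.

#include <stdint.h>

#define BANK_SIZE    16     /* illustrative bank size, not a real GUC */
#define HASH_BUCKETS 64     /* illustrative, power of two */

/* One SLRU bank: a small array of resident page numbers (-1 = empty slot). */
typedef struct IllustrativeBank
{
    int64_t page_numbers[BANK_SIZE];
} IllustrativeBank;

/*
 * Linear variant: a short scan over a cache-resident array.  With a small
 * bank this is a handful of predictable comparisons and no pointer chasing.
 */
static int
bank_lookup_linear(const IllustrativeBank *bank, int64_t pageno)
{
    for (int slot = 0; slot < BANK_SIZE; slot++)
    {
        if (bank->page_numbers[slot] == pageno)
            return slot;
    }
    return -1;                  /* page not resident */
}

/*
 * Hash variant: hash the key and probe a separate table before the buffer
 * itself is touched.  This models the extra work the benchmark above
 * measures; a real implementation would use PostgreSQL's shared hash table
 * machinery rather than this toy open-addressing table.
 */
typedef struct IllustrativeHashEntry
{
    int64_t pageno;
    int     slot;               /* -1 = bucket unused */
} IllustrativeHashEntry;

static int
bank_lookup_hash(const IllustrativeHashEntry *table, int64_t pageno)
{
    uint64_t h = (uint64_t) pageno * UINT64_C(0x9E3779B97F4A7C15);

    for (int probe = 0; probe < HASH_BUCKETS; probe++)
    {
        const IllustrativeHashEntry *e = &table[(h + probe) & (HASH_BUCKETS - 1)];

        if (e->slot == -1)
            return -1;          /* hit an empty bucket: page not resident */
        if (e->pageno == pageno)
            return e->slot;
    }
    return -1;
}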

>
>> Maybe we could decouple locks and counters from SLRU banks? Banks were meant to be small to exploit performance of local linear search. Lock partitions have to be bigger for sure.
>
> Yeah, that could also be an idea if we plan to drop the hash.  I mean
> the bank-wise counter is fine as we are finding a victim buffer within
> a bank itself, but each lock could cover more slots than one bank
> size, or in other words, it can protect multiple banks.  Let's hear
> more opinions on this.
+1
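
A minimal sketch of how such a decoupling could map pages to banks and to lock partitions (the constants and function names here are assumptions for illustration, not taken from the patch): the bank stays small so the victim search remains a short linear scan, while one LWLock partition protects several consecutive banks.

#include <stdint.h>

/* Illustrative geometry, not actual GUCs or patch constants. */
#define SLOTS_PER_BANK 16   /* small bank keeps linear victim search cheap */
#define BANKS_PER_LOCK 8    /* one lock partition protects 8 adjacent banks */

/* Bank index: used for the per-bank victim-search cursor and counters. */
static inline int
slru_bank_for_page(int64_t pageno, int nslots)
{
    int nbanks = nslots / SLOTS_PER_BANK;

    return (int) (pageno % nbanks);
}

/* Lock partition index: coarser than a bank, so fewer, bigger lock domains. */
static inline int
slru_lock_partition_for_page(int64_t pageno, int nslots)
{
    return slru_bank_for_page(pageno, nslots) / BANKS_PER_LOCK;
}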

>
>>
>> On 30 Oct 2023, at 09:20, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>>
>> I have taken 0001 and 0002 from [1], done some bug fixes in 0001
>>
>>
>> BTW can you please describe in more detail what kind of bugs?
>
> Yeah, actually that patch was using the same GUC
> (multixact_offsets_buffers) in SimpleLruInit for MultiXactOffsetCtl as
> well as for MultiXactMemberCtl; see the below patch snippet from the
> original patch.

Ouch. We were running this for several years with this bug... Thanks!
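
For readers who do not have the original patch at hand, the mix-up described above would look roughly like this (argument lists abbreviated and the members GUC name assumed; this is a reconstruction of the description, not the quoted patch snippet):

/* Buggy: both multixact SLRUs sized by the offsets GUC. */
SimpleLruInit(MultiXactOffsetCtl, "MultiXactOffset",
              multixact_offsets_buffers, /* ... remaining args ... */);
SimpleLruInit(MultiXactMemberCtl, "MultiXactMember",
              multixact_offsets_buffers, /* ... */);  /* <-- wrong GUC */

/* Intended: the members SLRU sized by its own GUC. */
SimpleLruInit(MultiXactMemberCtl, "MultiXactMember",
              multixact_members_buffers, /* ... */);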


Best regards, Andrey Borodin.




Attachment
