Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock - Mailing list pgsql-hackers

From tender wang
Subject Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Date
Msg-id CAHewXNnXoTe1qyTnZNLt02-nNnt2-5LwGVwrBMJmON8oU6Gx3A@mail.gmail.com
Whole thread Raw
In response to Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
Responses Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
List pgsql-hackers


Andrey M. Borodin <x4mmm@yandex-team.ru> 于2023年12月14日周四 17:02写道:


> On 14 Dec 2023, at 08:12, Amul Sul <sulamul@gmail.com> wrote:
>
>
> + int bankno = pageno & ctl->bank_mask;
>
> I am a bit uncomfortable seeing it as a mask, why can't it be simply a number
> of banks (num_banks) and get the bank number through modulus op (pageno %
> num_banks) instead of bitwise & operation (pageno & ctl->bank_mask) which is a
> bit difficult to read compared to modulus op which is quite simple,
> straightforward and much common practice in hashing.
>
> Are there any advantages of using &  over % ?
 
use Compiler Explorer[1] tool, '%' has more Assembly instructions than '&' . 
int GetBankno1(int pageno) {
    return pageno & 127;
}

int GetBankno2(int pageno) {
    return pageno % 127;
}
under clang 13.0
GetBankno1:                             # @GetBankno1
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], edi
        mov     eax, dword ptr [rbp - 4]
        and     eax, 127
        pop     rbp
        ret
GetBankno2:                             # @GetBankno2
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], edi
        mov     eax, dword ptr [rbp - 4]
        mov     ecx, 127
        cdq
        idiv    ecx
        mov     eax, edx
        pop     rbp
        ret
under gcc 13.2
GetBankno1:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     eax, DWORD PTR [rbp-4]
        and     eax, 127
        pop     rbp
        ret
GetBankno2:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     eax, DWORD PTR [rbp-4]
        movsx   rdx, eax
        imul    rdx, rdx, -2130574327
        shr     rdx, 32
        add     edx, eax
        mov     ecx, edx
        sar     ecx, 6
        cdq
        sub     ecx, edx
        mov     edx, ecx
        sal     edx, 7
        sub     edx, ecx
        sub     eax, edx
        mov     ecx, eax
        mov     eax, ecx
        pop     rbp
        ret



The instruction AND is ~20 times faster than IDIV [0]. This is relatively hot function, worth sacrificing some readability to save ~ten nanoseconds on each check of a status of a transaction.

  Now that AND is more faster, Can we  replace the '% SLRU_MAX_BANKLOCKS' operation in  SimpleLruGetBankLock()  with '& 127' :
  SimpleLruGetBankLock()  
{
       int banklockno = (pageno & ctl->bank_mask) % SLRU_MAX_BANKLOCKS;
                                                                               use '&'
       return &(ctl->shared->bank_locks[banklockno].lock);
}
Thoughts?

[0] https://www.agner.org/optimize/instruction_tables.pdf

pgsql-hackers by date:

Previous
From: Junwang Zhao
Date:
Subject: Re: [meson] expose buildtype debug/optimization info to pg_config
Next
From: "Andrey M. Borodin"
Date:
Subject: Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock