Re: define pg_structiszero(addr, s, r) - Mailing list pgsql-hackers
From | Bertrand Drouvot |
---|---|
Subject | Re: define pg_structiszero(addr, s, r) |
Date | |
Msg-id | ZyR2Tzdmj9c6E/QB@ip-10-97-1-34.eu-west-3.compute.internal Whole thread Raw |
In response to | Re: define pg_structiszero(addr, s, r) (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: define pg_structiszero(addr, s, r)
|
List | pgsql-hackers |
Hi, On Fri, Nov 01, 2024 at 05:45:44PM +1300, David Rowley wrote: > On Thu, 31 Oct 2024 at 19:17, Michael Paquier <michael@paquier.xyz> wrote: > > Seems kind of OK seen from here. I am wondering if others more > > comments about the name of the macro, its location, the fact that we > > still have pagebytes in bufpage.c, etc. > > This change looks incorrect: > > @@ -126,18 +124,9 @@ PageIsVerifiedExtended(Page page, BlockNumber > blkno, int flags) > } > > /* Check all-zeroes case */ > - all_zeroes = true; > pagebytes = (size_t *) page; > - for (i = 0; i < (BLCKSZ / sizeof(size_t)); i++) > - { > - if (pagebytes[i] != 0) > - { > - all_zeroes = false; > - break; > - } > - } > > - if (all_zeroes) > + if (pg_memory_is_all_zeros(pagebytes, (BLCKSZ / sizeof(size_t)))) > return true; > > I think this should be passing BLCKSZ rather than (BLCKSZ / > sizeof(size_t)), otherwise, it'll just check the first 1 kilobyte is > zero rather than the entire page. Thanks for looking at it! Nice catch, indeed by using the new function we are changing the pointer arithmetic here and then we should pass BLCKSZ instead. > However, I've also performance concerns as if I look at the assembly > of PageIsVerifiedExtended, I see the zero checking is now done 1 byte > at a time: > > (gcc 11.4) > > leaq 1024(%rbx), %rdx <-- 1KB bug > .p2align 4,,10 > .p2align 3 > .L60: > cmpb $0, (%rax) <-- check 1 byte is zero. > jne .L59 > addq $1, %rax <-- increment by 1 byte. > cmpq %rax, %rdx <-- check if we've done 1024 bytes yet. > jne .L60 > > Whereas previously it was doing: > > movq %rbx, %rax > leaq 8192(%rbx), %rdx <-- 8KB > jmp .L60 > .p2align 4,,10 > .p2align 3 > .L90: > addq $8, %rax <-- increment by 8 bytes (or sizeof(size_t)) > cmpq %rax, %rdx > je .L61 > .L60: > cmpq $0, (%rax) <-- checks an entire 8 bytes are zero. > > I didn't test how performance-critical this is, but the header comment > for this function does use the words "cheaply detect". > > * This is called when a page has just been read in from disk. The idea is > * to cheaply detect trashed pages before we go nuts following bogus line > * pointers, testing invalid transaction identifiers, etc. > > so it seems a bit dangerous to switch this loop to byte-at-a-time > rather than doing 8 bytes at once without testing the performance > isn't affected. Agree, I did a quick test (see [0]) and it looks like it's significantly slower using the new inline function. We could try to write a more elaborate version of pg_memory_is_all_zeros(), but as it looks like there is only one use case, then it's probably better to not implement (revert) this change here and "just" add a comment as to why pg_memory_is_all_zeros() should not be used here, thoughts? [0]: https://godbolt.org/z/xqnW4MPY5 Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: