Re: define pg_structiszero(addr, s, r) - Mailing list pgsql-hackers
From | Bertrand Drouvot |
---|---|
Subject | Re: define pg_structiszero(addr, s, r) |
Date | |
Msg-id | ZzmA24RVVBUhRVg8@ip-10-97-1-34.eu-west-3.compute.internal |
In response to | Re: define pg_structiszero(addr, s, r) (Ranier Vilela <ranier.vf@gmail.com>) |
List | pgsql-hackers |
Hi,

On Sat, Nov 16, 2024 at 11:42:54AM -0300, Ranier Vilela wrote:
> > On Fri, Nov 15, 2024 at 11:43, Bertrand Drouvot <
> > bertranddrouvot.pg@gmail.com> wrote:
> >
> >> while that should be:
> >>
> >> "
> >> static inline bool
> >> pg_memory_is_all_zeros_simd(const void *p, const void *end)
> >>
> > What I'm trying here, obviously, is a hack.
> > If it works, and the compiler accepts it, it's ok for me.
> >
> > If this hack is safe and correct, I think that 204 times faster,
> > it is very good, for a block size 8192.

The "hack" is not correct, because it's doing:

"
static inline bool
all_zeros_simd(const size_t *p, const size_t * end)
{
    for (; p < (end - sizeof(size_t) * 7); p += sizeof(size_t) * 8)
    {
        if ((((size_t *) p)[0] != 0) | (((size_t *) p)[1] != 0) |
            (((size_t *) p)[2] != 0) | (((size_t *) p)[3] != 0) |
            (((size_t *) p)[4] != 0) | (((size_t *) p)[5] != 0) |
            (((size_t *) p)[6] != 0) | (((size_t *) p)[7] != 0))
            return false;
    }
.
.
"

"p += sizeof(size_t) * 8" advances by 64 elements. But those elements are
"size_t" elements (since you're using size_t pointers as the function
arguments). So instead of advancing by 64 bytes, you now advance by 512 bytes
(64 elements of 8 bytes each, with an 8-byte size_t). But you only check
64 bytes per iteration -> you're missing 448 bytes to check per iteration.
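To make the stride issue concrete, here is a minimal sketch. It is not taken from any patch in this thread: the names hack_stride_in_bytes, all_zeros_wrong_stride and all_zeros_by_words are hypothetical, and the loop bounds are simplified compared to the quoted code. It assumes the buffer is size_t-aligned and that its length is a multiple of sizeof(size_t).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes skipped by "p += sizeof(size_t) * 8"
 * when p is a size_t pointer. Pointer arithmetic counts elements, not bytes.
 */
static size_t
hack_stride_in_bytes(void)
{
    static const size_t buf[sizeof(size_t) * 8] = {0};
    const size_t *p = buf;
    /* One past the end of buf: valid to form, never dereferenced. */
    const size_t *next_p = p + sizeof(size_t) * 8;

    return (size_t) ((const char *) next_p - (const char *) p);
}

/*
 * Hypothetical name. Uses the same stride as the quoted loop:
 * sizeof(size_t) * 8 elements per step, while only 8 elements are checked.
 * The loop bound is simplified so that it never reads out of bounds.
 */
static bool
all_zeros_wrong_stride(const size_t *p, const size_t *end)
{
    const size_t step = sizeof(size_t) * 8;

    assert(p <= end);
    for (; (size_t) (end - p) >= step; p += step)
    {
        if ((p[0] != 0) | (p[1] != 0) | (p[2] != 0) | (p[3] != 0) |
            (p[4] != 0) | (p[5] != 0) | (p[6] != 0) | (p[7] != 0))
            return false;
    }
    /* Remaining elements, one at a time. */
    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }
    return true;
}

/*
 * Hypothetical name. Checks 8 elements per iteration and advances by
 * exactly those 8 elements, so no element is skipped.
 */
static bool
all_zeros_by_words(const size_t *p, const size_t *end)
{
    assert(p <= end);
    for (; end - p >= 8; p += 8)
    {
        if ((p[0] != 0) | (p[1] != 0) | (p[2] != 0) | (p[3] != 0) |
            (p[4] != 0) | (p[5] != 0) | (p[6] != 0) | (p[7] != 0))
            return false;
    }
    /* Remaining elements, one at a time. */
    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }
    return true;
}
```

With a zeroed buffer of 1024 size_t elements where only element 8 is set to 1, all_zeros_wrong_stride(buf, buf + 1024) still returns true because element 8 is never visited, while all_zeros_by_words(buf, buf + 1024) returns false. With an 8-byte size_t, hack_stride_in_bytes() returns 512.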
We can "visualize" this by adding a few output messages like:

"
static inline bool
all_zeros_simd(const size_t *p, const size_t * end)
{
    for (; p < (end - sizeof(size_t) * 7); p += sizeof(size_t) * 8)
    {
        printf("Current p: %p\n", (void*)p);
        printf("Checking elements:\n");
        printf("[0]: %p = %zu\n", (void*)&((size_t *)p)[0], ((size_t *)p)[0]);
        printf("[1]: %p = %zu\n", (void*)&((size_t *)p)[1], ((size_t *)p)[1]);
        printf("[2]: %p = %zu\n", (void*)&((size_t *)p)[2], ((size_t *)p)[2]);
        printf("[3]: %p = %zu\n", (void*)&((size_t *)p)[3], ((size_t *)p)[3]);
        printf("[4]: %p = %zu\n", (void*)&((size_t *)p)[4], ((size_t *)p)[4]);
        printf("[5]: %p = %zu\n", (void*)&((size_t *)p)[5], ((size_t *)p)[5]);
        printf("[6]: %p = %zu\n", (void*)&((size_t *)p)[6], ((size_t *)p)[6]);
        printf("[7]: %p = %zu\n", (void*)&((size_t *)p)[7], ((size_t *)p)[7]);

        const size_t *next_p = p + sizeof(size_t) * 8;
        printf("Next p will be: %p (advance of %zu bytes)\n", (void*)next_p, (size_t)((char*)next_p - (char*)p));
.
.
.
"

Then we get things like:

"
Current p: 0x7fff2a93e500
Checking elements:
[0]: 0x7fff2a93e500 = 0
[1]: 0x7fff2a93e508 = 0
[2]: 0x7fff2a93e510 = 0
[3]: 0x7fff2a93e518 = 0
[4]: 0x7fff2a93e520 = 0
[5]: 0x7fff2a93e528 = 0
[6]: 0x7fff2a93e530 = 0
[7]: 0x7fff2a93e538 = 0
Next p will be: 0x7fff2a93e700 (advance of 512 bytes)
"

Meaning that you're checking 64 bytes per iteration (for example from 0x500 to
0x508 is 8 bytes) while advancing by 512 bytes: you're missing 448 bytes to
check per iteration.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com