Re: define pg_structiszero(addr, s, r) - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: define pg_structiszero(addr, s, r)
Msg-id ZzmA24RVVBUhRVg8@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: define pg_structiszero(addr, s, r)  (Ranier Vilela <ranier.vf@gmail.com>)
List pgsql-hackers
Hi,

On Sat, Nov 16, 2024 at 11:42:54AM -0300, Ranier Vilela wrote:
> > On Fri, Nov 15, 2024 at 11:43, Bertrand Drouvot <
> > bertranddrouvot.pg@gmail.com> wrote:
> >
> >> while that should be:
> >>
> >> "
> >> static inline bool
> >> pg_memory_is_all_zeros_simd(const void *p, const void *end)
> >>
> > What I'm trying here, obviously, is a hack.
> > If it works, and the compiler accepts it, it's ok for me.
> >
> > If this hack is safe and correct, I think that 204 times faster,
> > it is very good, for a block size 8192.

The "hack" is not correct, because it's doing:

"
static inline bool
all_zeros_simd(const size_t *p, const size_t * end)
{
    for (; p < (end - sizeof(size_t) * 7); p += sizeof(size_t) * 8)
    {
        if ((((size_t *) p)[0] != 0) | (((size_t *) p)[1] != 0) |
            (((size_t *) p)[2] != 0) | (((size_t *) p)[3] != 0) |
            (((size_t *) p)[4] != 0) | (((size_t *) p)[5] != 0) |
            (((size_t *) p)[6] != 0) | (((size_t *) p)[7] != 0))
            return false;
    }
.
.
"

"p += sizeof(size_t) * 8" advances by 64 elements (sizeof(size_t) is 8 here).
But those elements are "size_t" elements (since you're using size_t pointers as
the function arguments), so pointer arithmetic scales the step by
sizeof(size_t) once more.

So instead of advancing by 64 bytes, you now advance by 512 bytes. But
you only check 64 bytes per iteration -> you're missing 448 bytes to check
per iteration.
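
To make the scaling explicit, here is a minimal standalone sketch (not part of
any patch; stride_in_bytes() is a hypothetical helper name, and the 128-element
buffer is an arbitrary choice for illustration). It measures how many bytes
"p + n" actually covers when p is a size_t pointer:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: returns the number of bytes
 * covered by "p + n" when p is a pointer to size_t.
 */
static size_t
stride_in_bytes(size_t n)
{
    /* Scratch buffer so that the pointer arithmetic stays in bounds. */
    static const size_t buf[128] = {0};
    const size_t *p = buf;
    const size_t *next;

    assert(n <= 128);

    /* Pointer arithmetic on size_t * moves n elements, not n bytes. */
    next = p + n;

    return (size_t) ((const char *) next - (const char *) p);
}
```

On a platform where sizeof(size_t) is 8, stride_in_bytes(8) gives 64 (the
amount actually checked per iteration), while
stride_in_bytes(sizeof(size_t) * 8) gives 512 (the amount the loop skips over).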

We can "visualize" this by adding a few output messages like:

"
static inline bool
all_zeros_simd(const size_t *p, const size_t * end)
{
    for (; p < (end - sizeof(size_t) * 7); p += sizeof(size_t) * 8)
    {
        printf("Current p: %p\n", (void*)p);
        printf("Checking elements:\n");
        printf("[0]: %p = %zu\n", (void*)&((size_t *)p)[0], ((size_t *)p)[0]);
        printf("[1]: %p = %zu\n", (void*)&((size_t *)p)[1], ((size_t *)p)[1]);
        printf("[2]: %p = %zu\n", (void*)&((size_t *)p)[2], ((size_t *)p)[2]);
        printf("[3]: %p = %zu\n", (void*)&((size_t *)p)[3], ((size_t *)p)[3]);
        printf("[4]: %p = %zu\n", (void*)&((size_t *)p)[4], ((size_t *)p)[4]);
        printf("[5]: %p = %zu\n", (void*)&((size_t *)p)[5], ((size_t *)p)[5]);
        printf("[6]: %p = %zu\n", (void*)&((size_t *)p)[6], ((size_t *)p)[6]);
        printf("[7]: %p = %zu\n", (void*)&((size_t *)p)[7], ((size_t *)p)[7]);

        const size_t *next_p = p + sizeof(size_t) * 8;
        printf("Next p will be: %p (advance of %zu bytes)\n",
               (void*)next_p,
               (size_t)((char*)next_p - (char*)p));
.
.
.
"

Then we get things like:

"
Current p: 0x7fff2a93e500
Checking elements:
[0]: 0x7fff2a93e500 = 0
[1]: 0x7fff2a93e508 = 0
[2]: 0x7fff2a93e510 = 0
[3]: 0x7fff2a93e518 = 0
[4]: 0x7fff2a93e520 = 0
[5]: 0x7fff2a93e528 = 0
[6]: 0x7fff2a93e530 = 0
[7]: 0x7fff2a93e538 = 0
Next p will be: 0x7fff2a93e700 (advance of 512 bytes)
"

Meaning that you're checking 64 bytes per iteration (8 elements of 8 bytes
each: for example from 0x500 to 0x508 is 8 bytes) while advancing by 512 bytes
(from 0x500 to 0x700): you're missing 448 bytes to check per iteration.
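
For comparison, a sketch of a loop whose stride matches what it checks could
look like the following. This is only an illustration, not the code proposed in
this thread: the name all_zeros_words() is hypothetical, and it assumes that p
and end are both size_t-aligned and that p <= end. With size_t pointers the
counts are expressed in elements, so the step is 8 and not sizeof(size_t) * 8:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical illustration: returns true if every size_t word in [p, end)
 * is zero.  Assumes p and end are size_t-aligned and p <= end.
 */
static inline bool
all_zeros_words(const size_t *p, const size_t *end)
{
    assert(p <= end);

    /* Check 8 words per iteration and advance by those same 8 words. */
    for (; end - p >= 8; p += 8)
    {
        if ((p[0] != 0) | (p[1] != 0) |
            (p[2] != 0) | (p[3] != 0) |
            (p[4] != 0) | (p[5] != 0) |
            (p[6] != 0) | (p[7] != 0))
            return false;
    }

    /* Check the remaining 0 to 7 words one at a time. */
    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }

    return true;
}
```

Writing the loop bound as "end - p >= 8" also avoids computing a pointer before
the start of the buffer when fewer than 8 words are passed in.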

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


